The looming issue in big data isn't technology but the privacy and ethics decisions associated with how, when and if results should be provided.
Big data is a big deal for companies in 2013. The prospect of outdistancing your competition by leveraging your company's data with huge data sources such as NASA, the government, video and demographic services is compelling. But there's evidence that technology is advancing faster than companies and governments can manage it. Along with big data technology developers, your company should be thinking about adding a "big data ethicist."
The need for big data technologists is well covered in the media. At the annual Gartner IT Symposium, I reported on a looming gap between big data needs and technologists to fill those needs:
"The Gartner analysts predicted that by 2015, 4.4 million IT jobs globally will be created to support big data, with 1.9 million of those jobs in the United States. That employment projection carries further weight when, as the Gartner analysts pointed out, each of those jobs will create employment for three more people outside of IT.
"However, while the jobs will be created, there is no assurance that there will be employees to fill those positions. Sondergaard provided the dour prediction that only one-third of the jobs will be filled due to a lack of skilled big data applicants. One of the biggest tasks for CIOs is to rethink how to hire and train a workforce able to meet this demand for big data talent."
As InformationWeek executive editor Doug Henschen explained in an article on the big data talent war, technology executives need to engage a seven-point hiring and training plan for big data professionals.
So the need for technology talent in the big data segment is clear. But recent tragic events also show how big data extends far beyond a company's technology or marketing departments.
Recently, the Journal News, a newspaper based in White Plains, New York, touched off a furor when it published a Google map showing the locations of 44,000 registered handgun owners in Westchester, Rockland and Putnam counties in New York State. The registration information, obtained under the Federal Freedom of Information Act, is a vivid example of, as the Christian Science Monitor reported, the disputes that can arise when constitutional rights -- in this case, the First and Second Amendments -- clash.
The tragedy at the Sandy Hook Elementary School was the catalyst for the Journal News' decision to publish the gun owner information. The ease with which the paper combined records obtained through a Freedom of Information request with Google Maps shows how data is becoming more accessible in the big data era. While Putnam County officials have so far resisted providing the newspaper with gun ownership information, it appears they are unlikely to block access going forward.
Marc Parrish continues the discussion of big data's role in Second Amendment rights in The Atlantic. In his article he states, "Big data might have stopped the massacres in Newtown, Aurora, and Oak Creek. But it didn't, because there is no national database of gun owners, and no national record-keeping of firearm and ammunition purchases. Most states don't even require a license to buy or keep a gun.
"That's a tragedy, because combining simple math and the power of crowds could give us the tools we need to red flag potential killers even without new restrictions on the guns anyone can buy. Privacy advocates may hate the idea, but an open national database of ammunition and gun purchases may be what America needs if we're ever going to get our mass shooting problem under control."
While it is beyond the editorial mission of business publications like this one to take a side on the Second Amendment controversy, one technology-related aspect of Parrish's article is undeniably correct: The task of developing a database of 300 million guns and their owners is now so trivial it hardly falls into the big data category.
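A rough back-of-the-envelope calculation shows why a dataset of that size is modest by current standards. The sketch below is illustrative only: the 500 bytes per record is an assumption chosen for the example, not a figure from Parrish's article, and the function name is hypothetical.

```python
def estimate_size_gb(record_count, bytes_per_record):
    """Return the raw storage footprint in gigabytes (1 GB = 1e9 bytes)."""
    return record_count * bytes_per_record / 1e9


# 300 million records comes from the article; 500 bytes per record is an
# assumed average (owner name, address, make, model, serial number, dates).
RECORD_COUNT = 300_000_000
ASSUMED_BYTES_PER_RECORD = 500

total_gb = estimate_size_gb(RECORD_COUNT, ASSUMED_BYTES_PER_RECORD)
print(f"Estimated raw size: {total_gb:.0f} GB")
```

Under that assumption the raw data would fit on a single commodity disk, which supports the point that the obstacle is policy rather than technology.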
The looming issue in big data isn't technology but the decisions associated with how, when and if results should be provided. Widespread access to public information, interfaces that make it easy to combine big data sources, and the ability to publish information to the Internet are going to yield some difficult decisions for the big data community.
And those decisions are only going to become more intense. A substantial amount of the data held by government organizations, for example, is still locked up in paper format. However, companies such as Captricity have developed innovative ways to turn massive amounts of paper-based data into digital form. Companies such as Panjiva are using big data and business intelligence to meld buyer and seller data in multiple public and private databases to create a unique global commerce engine.
In the enthusiasm around big data, there has been little discussion about what that data might uncover. Privacy issues will surface as data analytics becomes able to reveal identities by combining what was previously considered anonymous data with location and purchasing information. Alistair Croll, at O'Reilly Media, put it succinctly in an article entitled "Big Data Is Our Generation's Civil Rights Issue, and We Don't Know It": "Time for you to plan for not just how your big data strategy will be implemented, but what are the implications of the data your company will be creating and publishing."
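To make the re-identification risk concrete, here is a minimal sketch of how "anonymous" purchase records might be tied back to named individuals by joining on shared attributes. Everything in it is hypothetical: the records are made up, and the choice of ZIP code, birth year and gender as the linking fields is an assumption for illustration, not a description of any particular company's data.

```python
# Hypothetical, made-up records for illustration only.
anonymous_purchases = [
    {"zip": "10601", "birth_year": 1975, "gender": "F", "item": "prescription refill"},
    {"zip": "10601", "birth_year": 1982, "gender": "M", "item": "camping stove"},
    {"zip": "10956", "birth_year": 1975, "gender": "F", "item": "baby formula"},
]
public_records = [
    {"name": "Resident A", "zip": "10601", "birth_year": 1975, "gender": "F"},
    {"name": "Resident B", "zip": "10601", "birth_year": 1982, "gender": "M"},
    {"name": "Resident C", "zip": "10601", "birth_year": 1982, "gender": "M"},
    {"name": "Resident D", "zip": "10956", "birth_year": 1975, "gender": "F"},
]

# Assumed linking fields shared by both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def key(record):
    """Build the join key from the shared attributes."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(purchases, records):
    """Return (name, item) pairs where a purchase matches exactly one person."""
    names_by_key = {}
    for record in records:
        names_by_key.setdefault(key(record), []).append(record["name"])

    matches = []
    for purchase in purchases:
        candidates = names_by_key.get(key(purchase), [])
        if len(candidates) == 1:  # a unique match reveals the identity
            matches.append((candidates[0], purchase["item"]))
    return matches


linked = reidentify(anonymous_purchases, public_records)
for name, item in linked:
    print(f"{name} bought: {item}")
```

In this toy data, two of the three purchases match exactly one person and are exposed, while the third stays ambiguous only because two residents happen to share the same attributes. Adding one more field, such as a location trace, would likely remove that ambiguity.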