Startups Offer Tools To Simplify Big Data
Not every big data project has a data scientist on board. Some startups are developing analysis tools to help non-specialists target the information they need.
Researchers at the University of Wisconsin last month announced a tool that combines two high-profile societal trends: a big data analytic that sifts through the 250 million Twitter messages to identify those belonging to school bullies or their victims.
The software, which is designed to identify after-the-fact posts from bullies, victims, accusers, or defenders, tags about 15,000 messages per day as related to a specific incident of bullying.
The ability to identify specific types of incidents from messages posted by participants outside of school should be valuable to school administrators whose only opportunity for assessing bullying in their schools might be an annual survey and whatever facts they can squeeze out of reticent victims, the researchers said.
Unfortunately for the developers, however, bully identification is not high on the list of priorities that venture capitalists are willing to fund.
On the other hand, it's relatively simple to land $2 million in funding for big-data tool development, even if your company has a squirrely name and a shadowy origin as a spinoff of the National Security Agency--that is, as long as your plan is to add enough security to big data apps to let heavily regulated industries like health care and financial services use them.
The company, Sqrrl, is developing security that's granular enough to allow emergency room staff access to a patient's phone number, street address, or list of allergies in an Electronic Health Record, while locking down his or her Social Security number and financial and personal details.
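To make the idea of field-level access concrete, here is a minimal Python sketch. It is not Sqrrl's implementation, which the article does not describe. The label names (`clinical`, `billing`), the `FIELD_LABELS` table, and the `visible_fields` helper are all hypothetical, and the record holds placeholder values only.

```python
from typing import Dict, Set

# Hypothetical mapping: the label a caller must hold to read each field.
FIELD_LABELS: Dict[str, str] = {
    "phone": "clinical",
    "street_address": "clinical",
    "allergies": "clinical",
    "ssn": "billing",
    "insurance_account": "billing",
}


def visible_fields(record: Dict[str, str], authorizations: Set[str]) -> Dict[str, str]:
    """Return only the fields whose label is among the caller's authorizations.

    Fields with no label at all are hidden (deny by default).
    """
    return {
        name: value
        for name, value in record.items()
        if FIELD_LABELS.get(name) in authorizations
    }


# Placeholder patient record, for illustration only.
record = {
    "phone": "555-0100",
    "street_address": "1 Example St",
    "allergies": "penicillin",
    "ssn": "000-00-0000",
    "insurance_account": "ACCT-EXAMPLE",
}

# Emergency room staff hold only the hypothetical "clinical" authorization.
er_view = visible_fields(record, {"clinical"})
print(sorted(er_view))
```

In this sketch the same record serves both audiences: a caller holding only `clinical` sees contact details and allergies, while the Social Security number and account data stay hidden unless the caller also holds `billing`. Hiding unlabeled fields by default is a choice made for the example, not something the article attributes to any product.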
Security that granular could create a huge change in the way health care organizations handle digital health records, which currently can be either open or closed and have little ability to hide the most sensitive bits of data.
Big data is important, but big analytics are more important, because what matters is not having big data but doing something useful with it, according to data- and information-management analyst Colin White of BI Research.
The long-term difference between a company that uses big data effectively and one that ends up wandering in the desert with no idea where to go may be the organization's ability to focus on the workload needed to address a problem rather than focusing on a new technology such as Hadoop, White said.
While there is definitely a gap between corporate executives' awareness of big data and the questions they would like it to answer, the more immediate question is how to actually get those answers without tools that let non-specialists ask their own questions--and without the infrastructure of tools needed to whip big-data sets into shape in the first place, according to Shalini Das, research director at the CIO Executive Board.
Big data technology and the knowledge base describing how to use it are both desperately immature and, in fact, are impeding each other's development.
Without knowing what questions to ask, how to ask them, or what data needs to be assembled to find the answers they need, business managers--who should be the ultimate consumers of big-data analytics--don't know where to start, according to White. And without specific questions to be answered, goals to be met, and well-defined benefits to be gained from effective analysis, Das explained, big data analytics software developers don't know which types of tools would be most valuable.
Effective tools and best practices will develop in tandem, forcing IT and business analysts to work together to set ground rules and to implement the actual technology, according to White.