When a task involves sifting through petabytes or more of streaming data to find insights, anomalies, or other kernels of actionable information, software can do what humans can't. But can software also lessen the dependence on human experts who guide big data strategies?
This question matters because a projected shortage of data scientists -- individuals who combine the skills of a business analyst, machine learning expert, and data engineer -- may hinder big data's potential within a few years. On the other hand, these fears may be overblown, particularly if technological advances render the data scientist less essential, if still valuable, to organizations' data-driven plans.
Executives from Red Lambda, a big data security and analytics startup based in Longwood, Fla., say that increasingly sophisticated software tools are simplifying the complexities of big data and making it more accessible to mainstream business users.
In a phone interview with InformationWeek, Red Lambda founder and executive chairman Bahram Yusefzadeh and CEO Iain Kerr commented on how the role of the data scientist is evolving as streaming analytics software, such as Red Lambda's own offering designed to find security threats in vast streams of moving data, takes hold in organizations.
Detecting anomalies such as security threats, for instance, requires near-instantaneous analysis.
"In the world that Red Lambda functions in, we don't need a data scientist for solving the security problem and identifying anomalous behavior," said Yusefzadeh. "We don't have time for a data scientist to come and analyze, as this data is in motion. You're not going to find anybody who can do that."
Solving security problems, he added, is analogous to finding a needle in a haystack -- but in a very short time.
"The bad guys are so smart, and are doing so much so quickly, that you don't have time for a data scientist to come in," Yusefzadeh said.
The notion that ongoing refinements will help make complex technologies accessible to mainstream users is nothing new, of course. But given the apparent shortage of data scientists needed to make today's complex big data software run smoothly, enhancements in analytics tools are particularly welcome.
"We're not saying you don't need data scientists in your organization," said Kerr. "There are many reasons why these folks exist."
But, he added, sophisticated software deployed effectively, such as Red Lambda's security apps that highlight anomalies, can enable business users with dashboard-style visualization tools to pinpoint potential threats very quickly.
Not every organization uses big data for near-real-time analysis, of course, and data scientists play an important role in many areas, including regulatory compliance and forensic analysis.
Red Lambda, for instance, has worked with a social media company that wanted to "slice and dice" its big data to sell the information, bring in advertising, and serve myriad other uses that, unlike security threats, weren't particularly time-sensitive.
In addition, the concept of the "hired gun" data scientist is misleading, said Yusefzadeh, as it's difficult, if not impossible, for an outsider to parachute into a business and immediately understand its strategy and the relevance of its data.
"There's no way you're going to be flying in data analysts... [who] look at your data and tell you what you want to do," Yusefzadeh said. Rather, people who are "already inside your company," those on the executive, strategy, and marketing teams, will provide the real actionable insights.