Cloud-based data science services aim to fill the gap for companies lacking in-house business intelligence or analytics expertise.
Business intelligence and analytics providers are responding to the continuing shortage of data scientists by offering data science know-how as a cloud service. These companies, both established providers and startups, have embraced the cloud in recent years.
But can businesses desperate to plumb large, complex datasets find answers to vital business questions this way? Opinions differ.
At Teradata, customers frequently seek out the company's consulting services in the course of purchasing one of its massively scalable database systems for a data warehouse or analytic application.
"Our differentiation is the knowledge of how to get algorithms implemented in a Teradata environment," said Bill Franks, chief analytics officer at Teradata, in a phone interview with InformationWeek. Achieving this kind of platform-specific optimization at scale generally is beyond "boutique" database consultants, he said.
Franks also suggested that boutiques too narrowly focused on one specialty, such as analytics for retailer Web traffic data, could be caught flat-footed as companies take these analytics in-house or need to mix and match different types of data.
Mike Hoskins, CTO of Actian, a provider of highly parallelized software, believes the world has entered a new period, one that will prove bigger and more disruptive than the shift to relational databases some 30 years ago.
"Google and Facebook are data companies, not software companies," he told InformationWeek in a phone call. "They collect data, process it and rent it back to the world with some kind of high-value extensions."
These early adopters -- massive collectors of data who drive ever more value out of the data they capture -- represent what will be common in every industry over time, Hoskins predicted.
Actian bolstered its cloud capabilities with the $162 million February purchase of Pervasive Software, which had been pushing its data-integration software into the cloud computing and big data markets through its Data Cloud and DataRush platforms. In April, it acquired ParAccel, a company that offers the massively parallel processing (MPP) database of the same name.
For the near term, Hoskins promotes Actian's own approach for enabling machine learning in traditional applications by using plug-in libraries that can be called by SQL programs. The company currently claims 500 to 600 of these advanced plug-ins for SQL.
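As a rough illustration of the general idea of analytics functions callable from SQL, the sketch below registers a small scoring function with SQLite and invokes it inside an ordinary query. It does not use or depict Actian's libraries; the function name `churn_score`, the `customers` table and the model coefficients are all hypothetical and chosen only for the example.

```python
import math
import sqlite3

# Hypothetical coefficients for a toy logistic model (assumptions, not real data).
INTERCEPT = -1.0
W_SUPPORT_CALLS = 0.8
W_TENURE_MONTHS = -0.05


def churn_score(support_calls, tenure_months):
    """Return a probability-like score between 0 and 1 for one customer row."""
    z = INTERCEPT + W_SUPPORT_CALLS * support_calls + W_TENURE_MONTHS * tenure_months
    return 1.0 / (1.0 + math.exp(-z))


def build_connection():
    """Create an in-memory database and register the scoring function with it."""
    conn = sqlite3.connect(":memory:")
    # After this call, SQL statements on this connection can call churn_score(a, b).
    conn.create_function("churn_score", 2, churn_score)
    conn.execute(
        "CREATE TABLE customers (name TEXT, support_calls INTEGER, tenure_months INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [("alice", 0, 40), ("bob", 5, 3)],
    )
    return conn


def rank_customers(conn):
    """Score every customer from plain SQL, highest score first."""
    query = (
        "SELECT name, churn_score(support_calls, tenure_months) AS score "
        "FROM customers ORDER BY score DESC"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for name, score in rank_customers(build_connection()):
        print(f"{name}: {score:.3f}")
```

The point of the pattern is that the application keeps issuing SQL while the model logic lives in a library the database can call, so existing programs need little change to use it.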
Meanwhile, one startup sees an opportunity to do a better job with local data, using new kinds of tools to extract insights from the wealth of information contained in business systems, including file systems.
"We believe data should be analyzed where it lives, in its place," said John Joseph, president and co-founder of DataGravity. The year-old, privately funded startup, which expects to release its first product in 2014, is focusing initially on providing meta information about unstructured corporate data, such as statistics about its history, ownership and alteration over time.
DataGravity's forthcoming tool will empower IT administrators at small and midsize businesses (SMBs), elevating them from "being a container manager to giving intelligence around stored data," Joseph told InformationWeek in a phone interview.
Joseph perceives a widening chasm between large companies, which can hire or grow data scientists, and SMBs -- DataGravity's target audience -- which are falling behind.
Over the long term, analytics is so strategic that companies will need to fold it into their design, planning and strategy, said Teradata's Franks.
Outsourcing the execution of these blueprints may still be necessary, but the long-term play is to "take ownership," Franks said, noting that Teradata pushes knowledge transfer -- the code, an explanatory report and any recommendations -- when it creates a plan for a customer.
"In more cases than not, [the customer] has some analytic talent in-house to take that transfer," he said.
But Actian's Hoskins sees a different future, and so isn't frantic about the data science gap.
While some scenarios will weigh against cloud-based analytics (issues like available data-transfer speeds, governance, security and privacy), Hoskins is optimistic that highly iterative machine learning will do an increasingly better job of predicting the future.
"People are skeptical of [outsourcing analytics] but I'd say they're unfairly skeptical," Hoskins said, remembering similar early assessments of cloud-based data-integration services.
"Linear regression as a service, while immature, is very promising," he said, adding he was "bullish" about analytics as a consumable service, delivered in the cloud.