Lie 2: The data we have is good
We'll bet money that most respondents' data sets are inaccurate, incomplete, and/or misaligned with one another. Do you really have a single source of truth? Do different groups slice data in different ways? Are you making decisions based on inaccurate or incomplete data?
Case in point: 19% of companies in our survey use geolocation as part of their analysis strategy, pulling information from smart devices and Web visitors to understand behavior. However, Web location tracking is notoriously inaccurate when it comes to enterprise and institutional traffic. That's because most companies and government agencies work in private clouds with a limited number of egress points. If you're using Web location data to track the success of your sales and marketing programs by region, you're likely basing decisions on bad information. That big block of traffic from Boston may actually come from an enterprise with offices in the Midwest.
Who's checking data quality? Just one in four respondents identified a dedicated business analyst group as one of the top two users of data within their company. It's simply amazing how many reports and graphs we see without sampling or accuracy notes. For example, almost every company does customer surveys, yet very few report confidence levels or weight their results. Got 25,000 customers? Your customer service survey should have 1,843 respondents if you want a 99% confidence level with a plus or minus 3% margin of error. Furthermore, results should be weighted by revenue level, so that responses from your biggest accounts count for more. Reality is, we just don't see that done with any type of data.
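For readers who want to check the survey arithmetic, here is a minimal sketch using the standard sample-size formula for a proportion. It assumes the worst-case response split (p = 0.5), which is our assumption and not something stated in the survey; the function name `sample_size` is hypothetical. Under those assumptions, the 1,843 figure matches the formula for a very large customer base. Applying the usual finite population correction for exactly 25,000 customers brings the requirement down to roughly 1,717.

```python
import math
from statistics import NormalDist


def sample_size(confidence, margin, population=None, p=0.5):
    """Respondents needed to estimate a proportion.

    confidence: e.g. 0.99 for a 99% confidence level
    margin:     e.g. 0.03 for plus or minus 3%
    population: total customers; None treats the population as very large
    p:          assumed response split; 0.5 is the most conservative choice
    """
    # Two-sided z-score for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size when the population is effectively unlimited
    n0 = z**2 * p * (1 - p) / margin**2
    if population is None:
        return n0
    # Finite population correction: smaller populations need fewer responses
    return n0 / (1 + (n0 - 1) / population)


n_inf = sample_size(0.99, 0.03)
n_fpc = sample_size(0.99, 0.03, population=25_000)

print(round(n_inf))      # 1843, the figure quoted above
print(math.ceil(n_fpc))  # 1717, after correcting for 25,000 customers
```

Either way, the point stands: a survey report should state its sample size, confidence level, and margin of error so readers can judge how far to trust it.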
Lie 3: Everything will be OK if we can just get more tools
A quarter of respondents plan to use more big data tools over the next 12 months. Now, we like Hadoop, NoSQL, Splunk, and the plethora of other big data options out there, but we recommend looking at what data sets are sitting idle before cutting a check. Given the low levels of use of the 20 internal and external data sets we asked about, it's clear the problem is related more to staffing than systems.
Unfortunately, fewer respondents plan to invest in big data staff than in technology. Only 33% plan to grow their training and development programs; 9% are cutting back. Net new hiring ranked at the bottom of our list, with 17% growing staffing levels compared with 14% cutting.
Nowhere is this "tools over people" focus more evident than in healthcare. The federal government's electronic records incentives have driven the industry to a new level of data collection and reporting. But now that healthcare providers have all this data, they're trying to figure out how to use it. "There is big money to be made in healthcare big data, so everyone and their brother was throwing up solutions," says Bill Gillis, CIO of Beth Israel Deaconess Physician Organization. But it's important to work with people who understand healthcare organizations and the complexity of their data, Gillis says. "The business need should drive the process," he says. "The tool alone will not change much. Finding a skilled hand that can effectively wield that tool will."
Lie 4: There's an expertise shortage
Speaking of staff, an oft-quoted McKinsey & Co. study estimates a shortfall of 140,000 to 190,000 people in "big data staffing" by 2018. Our own InformationWeek Staffing Survey shows that 18% of respondents focused on big data want to increase staff in this area by more than 30% in the next two years, but 53% say it will be difficult to find people with the right skills.
Roll the clock back a few years and substitute the words "virtualization engineer" or "Cobol programmer" or even "webmaster" for "big data specialist" and you'll similarly find people predicting doom. Don't get sucked in again. You already have much of this talent within your organization; you just need to set it free. Consider that 39% of respondent organizations have department-level analysts as the main users of their information. Break those people out of their department silos and move them toward a more holistic view of data.
For example, a U.S. retailer we worked with always had separate data teams aligned with various departments. The strongest team was within the catalog group, banging out circulation plans, catalog yields, conversion rates, even profit per page. Great stuff, but that team was limited to using catalog and financial data. Siloed from the Web team, they were missing the transformation happening within the customer base. Separate departments, separate views of the truth.
Don't blame the analysts. The company built the structure, and IT wasn't involved enough to identify the information gaps between departments. Our point is, IT not only has to understand the data itself, but it must also become an integral part of identifying and growing the centralized talent pool. That means putting more emphasis on training and talent development. Competitors will steal some of your more talented big data pros if you don't give them a reason to stick around.
We scanned the online listings of the major hiring sites and found that the salary levels for data analysts still range from about $55,000 to low six figures. However, if you add "big data" to the standard titles, the average salary doubles. Expect everyone's LinkedIn titles to change in the next 12 months.