11:22 AM
Seth Grimes

Big Data: Avoid 'Wanna V' Confusion

The three V's -- volume, velocity and variety -- do a fine job of defining big data. Don't be misled by the "wanna-V's": variability, veracity, validity and value.

IBM sees veracity as a fourth big data V. (Like me, IBM doesn't advocate variability, validity, or value as big data essentials.) Regarding veracity, IBM asks, "How can you act upon information if you don't trust it?"

Yet facts, whether captured in natural language or in a structured database, are not always true. False or outdated data may nonetheless be useful, and so may non-factual, subjective data (feelings and opinions).

Consider two statements, one asserting a fact and the other containing one that is no longer true. Join me in concluding that data may contain value unlinked from veracity:

-- "The Iraqi regime... possesses and produces chemical and biological weapons." -- George W. Bush, October 7, 2002.

-- "I am glad that George Bush is President." -- Daniel Pinchbeck, writing ironically, June, 2003.

Veracity does matter. I'll cite an old Russian proverb: "Trust, but verify." That is, analyze your data -- evaluate it in context, taking into account provenance -- in order to understand it and use it appropriately.

3 V's Versus 'Wanna-V's'

My aim here is to differentiate the essence of big data, as defined by Doug Laney's original-and-still-valid 3 V's, from the derived qualities of new Vs proposed by various vendors, pundits and gurus. My hope is to maintain clarity and stave off market-confusing fragmentation begotten by the wanna-V's.

On one side of the divide we have data capture and storage; on the other, business-goal oriented filtering, analysis and presentation. Databases and data streaming technologies answer the big data need; for the balance, the smart stuff, you need big data analytics.

Variability, veracity, validity and value aren't intrinsic, definitional big data properties. They are not absolutes. Rather, they reflect the uses you intend for your data. They relate to your particular business needs.

You discover context-dependent variability, veracity, validity and value in your data via analyses that assess and reduce data and present insights in forms that facilitate business decision-making. This function -- analytics -- is the key to understanding big data.

Seth Grimes is the leading industry analyst covering text analytics and sentiment analysis. He founded Alta Plana Corporation, a Washington-based technology strategy consultancy, in 1997.

