Commentary
8/7/2013
11:22 AM
Seth Grimes

Big Data: Avoid 'Wanna V' Confusion

The three V's -- volume, velocity and variety -- do a fine job of defining big data. Don't be misled by the "wanna-V's": variability, veracity, validity and value.

IBM sees veracity as a fourth big data V. (Like me, IBM doesn't advocate variability, validity, or value as big data essentials.) Regarding veracity, IBM asks, "How can you act upon information if you don't trust it?"

Yet facts, whether captured in natural language or in a structured database, are not always true. False or outdated data may nonetheless be useful, and so may non-factual, subjective data (feelings and opinions).

Consider two statements, one asserting a purported fact and the other containing a fact that is no longer true. Join me in concluding that data may contain value unlinked from veracity:

-- "The Iraqi regime... possesses and produces chemical and biological weapons." -- George W. Bush, October 7, 2002.

-- "I am glad that George Bush is President." -- Daniel Pinchbeck, writing ironically, June, 2003.

Veracity does matter. I'll cite an old Russian proverb: "Trust, but verify." That is, analyze your data -- evaluate it in context, taking into account provenance -- in order to understand it and use it appropriately.

3 V's Versus 'Wanna-V's'

My aim here is to differentiate the essence of big data, as defined by Doug Laney's original-and-still-valid 3 V's, from the derived qualities of the new V's proposed by various vendors, pundits and gurus. My hope is to maintain clarity and stave off the market-confusing fragmentation begotten by the wanna-V's.

On one side of the divide we have data capture and storage; on the other, business-goal oriented filtering, analysis and presentation. Databases and data streaming technologies answer the big data need; for the balance, the smart stuff, you need big data analytics.

Variability, veracity, validity and value aren't intrinsic, definitional big data properties. They are not absolutes. Rather, they reflect the uses you intend for your data and relate to your particular business needs.

You discover context-dependent variability, veracity, validity and value in your data via analyses that assess and reduce data and present insights in forms that facilitate business decision-making. This function -- analytics -- is the key to understanding big data.

Seth Grimes is the leading industry analyst covering text analytics and sentiment analysis. He founded the Washington-based technology strategy consultancy Alta Plana Corporation in 1997.
