8/23/2011
12:03 PM
Doug Henschen

10 Lessons Learned By Big Data Pioneers

How can you prepare for the big data era? Consider this expert advice from IT pros who have wrestled with the thorny problems, including data growth and unconventional data.


Just as consistent columnar data aids compression, you can improve compression further by sorting data before loading. comScore uses Syncsort DMExpress software to sort data alphanumerically before it's loaded into Sybase IQ. Where 10 bytes of unsorted data can be compressed to three or four bytes, says Michael Brown, comScore's chief technology officer, 10 bytes of sorted data can typically be crunched down to one byte. "That makes a huge difference in the volume of data we have to store," Brown says.
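To see why sorting helps, here is a minimal sketch in Python. It is not comScore's pipeline: it uses the standard-library zlib compressor as a stand-in for Sybase IQ's column compression, and the site names and row counts are made up for illustration. The ratios it prints will not match Brown's figures, but the direction of the effect should be the same.

```python
import random
import zlib


def compressed_size(values):
    """Return the zlib-compressed size, in bytes, of newline-joined values."""
    payload = "\n".join(values).encode("utf-8")
    return len(zlib.compress(payload))


def compare_sorted_vs_unsorted(values):
    """Compress one column of values as-is and again after sorting it."""
    return compressed_size(values), compressed_size(sorted(values))


# Hypothetical column: 20,000 page views spread over 200 made-up site names.
rng = random.Random(42)
column = [f"site-{rng.randrange(200):04d}.example" for _ in range(20_000)]

unsorted_bytes, sorted_bytes = compare_sorted_vs_unsorted(column)
print(f"unsorted: {unsorted_bytes} bytes, sorted: {sorted_bytes} bytes")
```

Sorting places identical values next to one another, so the compressor can encode long runs cheaply instead of repeatedly pointing back to scattered earlier occurrences.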

Sorting also can streamline processing. comScore sorts URL data to minimize Web site taxonomy lookups. Instead of loading the 40 URLs for Web site pages in the order they were visited during a session, sorting might reveal that 20 of those pages were on Facebook, 12 were on Gmail and the balance were at NYTimes.com. The sorted data would trigger just three site lookups, whereas unsorted data might trigger many redundant lookups if the visitor bounced back and forth among just a few sites. "That saves a lot of CPU time and a lot of effort," Brown says. It's possible to sort data with SQL statements and custom scripts, but sorting is also a common feature in data-integration software from IBM, Informatica, Oracle, SAP, SAS, Syncsort, and others. At truly large scale, Hadoop is an option for sorting and other processing steps.
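The lookup savings can be sketched the same way. The Python below assumes a loader that performs a taxonomy lookup only when the site changes from one record to the next; the `TAXONOMY` table, the URLs, and the function names are hypothetical and are not taken from comScore's system.

```python
from itertools import zip_longest
from urllib.parse import urlsplit

# Hypothetical taxonomy table mapping a host name to a site category.
TAXONOMY = {
    "www.facebook.com": "social",
    "mail.google.com": "email",
    "www.nytimes.com": "news",
}

_UNSET = object()  # sentinel meaning "no previous host seen yet"


def count_site_lookups(urls, lookup=TAXONOMY.get):
    """Categorize URLs, calling lookup() only when the host changes."""
    calls = 0
    previous_host = _UNSET
    category = None
    results = []
    for url in urls:
        host = urlsplit(url).hostname
        if host != previous_host:
            category = lookup(host)  # the expensive step in a real system
            calls += 1
            previous_host = host
        results.append((url, category))
    return calls, results


# A made-up 40-page session: 20 Facebook, 12 Gmail and 8 NYTimes.com pages,
# interleaved to mimic a visitor bouncing back and forth among the sites.
facebook = [f"https://www.facebook.com/page{i}" for i in range(20)]
gmail = [f"https://mail.google.com/msg{i}" for i in range(12)]
nytimes = [f"https://www.nytimes.com/story{i}" for i in range(8)]
session = [
    url
    for group in zip_longest(facebook, gmail, nytimes)
    for url in group
    if url is not None
]

unsorted_calls, _ = count_site_lookups(session)
sorted_calls, _ = count_site_lookups(sorted(session))
print(f"lookups in visit order: {unsorted_calls}, after sorting: {sorted_calls}")
```

Once the URLs are sorted, all pages from one site sit together, so the loader needs one lookup per site rather than one per change of site in the visit order.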
