How can you prepare for the big data era? Consider this expert advice from IT pros who have wrestled with the thorny problems, including data growth and unconventional data.
Better data compression saves on storage, and that's still important even as hardware costs per terabyte have declined. Column-store databases, such as HP Vertica, Infobright, ParAccel, and Sybase IQ, can achieve 30-to-1 or 40-to-1 compression while row-store databases, such as IBM DB2, Microsoft SQL Server, and MySQL, average 4-to-1 compression. That's because columnar data is consistent, containing all zip codes or all purchase order numbers, for example. Rows hold a mix of data, such as all the attributes associated with an individual customer--name, address, zip, purchase order number, and so on. The Aster Data and Oracle databases offer hybrid row/columnar features. Oracle's Hybrid Columnar Compression, for one, can crunch data at a 10-to-1 ratio.
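To see why consistent columns squeeze down so well, consider run-length encoding, one technique column stores commonly use. The sketch below is illustrative only: the sample rows and the `run_length_encode` helper are hypothetical, and it does not reflect how any of the products named above implement compression.

```python
from itertools import groupby

# Hypothetical sample rows: (name, zip code, purchase order number).
rows = [
    ("Ann", "94105", "PO-1001"),
    ("Bob", "94105", "PO-1002"),
    ("Cho", "94105", "PO-1003"),
    ("Dee", "10001", "PO-1004"),
    ("Eli", "10001", "PO-1005"),
]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Row layout: names, zips, and order numbers alternate, so no runs form.
row_stream = [field for row in rows for field in row]
row_encoded = run_length_encode(row_stream)

# Column layout: all zip codes sit side by side, so repeats collapse.
zip_column = [row[1] for row in rows]
zip_encoded = run_length_encode(zip_column)

print(len(row_stream), "row-layout values ->", len(row_encoded), "runs")
print(len(zip_column), "zip values ->", len(zip_encoded), "runs")
print(zip_encoded)
```

In this toy data the row layout gains nothing from the encoding, while the zip column shrinks from five values to two runs. Real-world ratios depend on the data and on the encoding schemes a given database applies.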
Compression levels vary depending on the data, and keep in mind that column-store databases aren't always the best choice. If your queries call on many attributes, a row-store product may deliver better performance. Indeed, row-store databases are more commonly used for enterprise data warehouses handling a mix of queries, whereas column-store databases more often power focused data marts. Column-store customers include digital-media measurement giant comScore, a Sybase IQ user since 1999, and fast-growing online network Interclick, which deployed ParAccel in 2009.
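The trade-off for queries that touch many attributes can be sketched with a simplified model. Everything here is an assumption made for illustration: the tiny table, the `fetch_from_rows` and `fetch_from_columns` helpers, and the idea of counting one "read" per record or per column. Real databases add caching, indexing, and vectorized execution that this ignores.

```python
# Hypothetical table stored in two layouts.
row_store = [
    {"name": "Ann", "zip": "94105", "po": "PO-1001"},
    {"name": "Bob", "zip": "10001", "po": "PO-1002"},
]
column_store = {
    "name": ["Ann", "Bob"],
    "zip": ["94105", "10001"],
    "po": ["PO-1001", "PO-1002"],
}


def fetch_from_rows(index, attributes):
    """Return (values, reads): one record read covers every attribute."""
    record = row_store[index]
    return {a: record[a] for a in attributes}, 1


def fetch_from_columns(index, attributes):
    """Return (values, reads): each attribute is read from its own column."""
    return {a: column_store[a][index] for a in attributes}, len(attributes)


wide = ["name", "zip", "po"]
print(fetch_from_rows(1, wide))
print(fetch_from_columns(1, wide))
print(fetch_from_columns(1, ["zip"]))
```

Under this model, a query that needs every attribute costs one read in the row layout but one read per column in the columnar layout, while a single-attribute query costs one read either way.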