Vertica Gets Flexible, Taps MapReduce
Upgrade automates best-possible storage and query approaches. MapReduce connection lets Hadoop be Hadoop.
Column-store innovation, MapReduce integration and an information lifecycle management (ILM) roadmap are the high points of yesterday's announcement of the Vertica Analytic Database 3.5. The upgrade, which is slated for general release this fall, promises performance gains and broader applicability, but on the ILM front, Vertica is following rather than leading the competition.
The centerpiece of the Vertica 3.5 release is a new FlexStore capability designed to automatically apply flexible database design, storage and query execution techniques. The idea is to keep the database management system optimized for particular analytic workloads. For instance, when FlexStore detects that there are either a small number of rows or a small number of columns, it can group data together to reduce input/output (I/O) workloads and thereby improve efficiency.
"If you have a dimension table with, say, ten columns of data, why arbitrarily split that up into ten different reads off disk if you could do it all in one read?" asks David Menninger, Vertica's vice president of marketing and product management. "By providing a more flexible organization of the column-store information, you enhance performance and increase the speed of the applications."
The FlexStore feature groups data automatically and dynamically, so if the number of columns or rows in a table grows, the organization changes to optimize I/O. With beta testing of the feature just beginning, Menninger says he can't provide performance-increase estimates.
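The I/O argument behind column grouping can be illustrated with a toy model. This is only a sketch, not Vertica's implementation: the function names and the "small table" thresholds are hypothetical, and each column group is assumed to cost one disk read.

```python
def reads_needed(num_columns, columns_per_group):
    """Reads required to fetch all columns when each group costs one read."""
    # Ceiling division: a partially filled group still costs a full read.
    return -(-num_columns // columns_per_group)


def choose_group_size(num_rows, num_columns,
                      small_rows=10_000, small_columns=16):
    """Pick how many columns to store together.

    The thresholds are illustrative assumptions, not values from Vertica.
    Small tables (few rows or few columns) are stored as one group;
    larger tables fall back to one column per group.
    """
    if num_rows <= small_rows or num_columns <= small_columns:
        return num_columns
    return 1


# A ten-column dimension table: ten reads column-by-column, one read grouped.
per_column_reads = reads_needed(10, 1)
grouped_reads = reads_needed(10, choose_group_size(500, 10))
```

Re-running `choose_group_size` as a table grows mirrors the dynamic reorganization described above: once the table passes both thresholds, the model reverts to pure column-at-a-time storage.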
FlexStore can also detect and place "hot" data that is accessed frequently on the fastest-performing disks and partitions within disks on the storage array -- a capability that has long been offered by both Teradata and Netezza. Since FlexStore recognizes the difference between hot and warm data -- and with planned development, cold data -- the feature gives Vertica a start on information lifecycle management (ILM). FlexStore will "eventually" be able to move data to speed- and cost-appropriate storage technologies ranging from solid state drives to near-line storage, Menninger says, but he couldn't predict when the functionality will be available. ILM support was released last month by Vertica competitor Sybase in its Sybase IQ 15.1 release.
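The hot/warm distinction amounts to classifying storage segments by how often they are accessed. The sketch below shows one simple way to do that; the function name, the access-log format and the thresholds are assumptions made for illustration, and the "cold" tier reflects planned rather than shipping functionality.

```python
from collections import Counter


def assign_tiers(access_log, hot_threshold=100, warm_threshold=10):
    """Classify segments as hot, warm or cold by access count.

    access_log is a sequence of segment identifiers, one entry per access.
    Both thresholds are hypothetical; a real system would tune them.
    """
    counts = Counter(access_log)
    tiers = {}
    for segment, hits in counts.items():
        if hits >= hot_threshold:
            tiers[segment] = "hot"    # candidate for the fastest disks
        elif hits >= warm_threshold:
            tiers[segment] = "warm"
        else:
            tiers[segment] = "cold"   # candidate for cheaper storage
    return tiers


# Hypothetical log: segment "a" read 150 times, "b" 20 times, "c" twice.
log = ["a"] * 150 + ["b"] * 20 + ["c"] * 2
placement = assign_tiers(log)
```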
The Vertica 3.5 release will also integrate MapReduce, but not by running the massively parallel processing framework inside the database itself, as Greenplum and Aster Data Systems have done. Rather, Vertica will connect to separate instances of Hadoop, the open-source implementation of MapReduce. Thus, users will be able to bring data from Vertica into Hadoop for analysis and then store result sets back to Vertica.
"MapReduce uses a lot of bandwidth, and by keeping it separate, it won't affect the performance of the analytic database," says Gartner analyst Donald Feinberg. "It's an alternative that makes sense for those who already have MapReduce clusters running or who want to buy Hadoop processing through [cloud-based provider] Cloudera."
The downside, of course, is that you have to support two separate environments, but Menninger says "our customers tell us they don't want MapReduce on the same computers."
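The round trip described above (export rows from the database, run a MapReduce job elsewhere, load the result set back) can be sketched in plain Python. This shows only the shape of the data flow; it does not use Hadoop or any Vertica connector, and the sample rows and function names are hypothetical.

```python
from collections import defaultdict


def map_phase(rows, mapper):
    """Apply the mapper to each row, yielding (key, value) pairs."""
    for row in rows:
        yield from mapper(row)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_job(rows, mapper, reducer):
    return reduce_phase(shuffle(map_phase(rows, mapper)), reducer)


# Hypothetical rows exported from the database: (region, sale amount).
exported_rows = [("west", 120.0), ("east", 80.0), ("west", 30.0)]

# Total sales per region; the result set would then be loaded back.
totals = run_job(
    exported_rows,
    mapper=lambda row: [(row[0], row[1])],
    reducer=lambda key, values: sum(values),
)
```

Because the job only sees exported rows and hands back a result set, it can run on entirely separate machines, which is the separation Feinberg and Menninger describe.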
Vertica 3.5 was announced at this week's TDWI World Conference in San Diego, where attendees were already chewing over the details of major announcements from Netezza and IBM and upgrades by Teradata and Sybase (see "Better, Faster, Cheaper: Summer '09 Data Warehousing Roundup").