Tradeoffs In Splitting DBMS Work Among MPP Nodes - InformationWeek

InformationWeek is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them.Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

IoT
IoT
Software // Information Management
Commentary
9/9/2008
12:16 PM
Curt Monash
Curt Monash
Commentary
50%
50%

Tradeoffs In Splitting DBMS Work Among MPP Nodes

I talk with lots of vendors of MPP data warehouse DBMS. I've now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives. The base-case MPP DBMS architecture is one in which there are two kinds of nodes...

I talk with lots of vendors of MPP data warehouse DBMS. I've now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives. The base-case MPP DBMS architecture is one in which there are two kinds of nodes:

A boss node, whose jobs include: - Receiving and parsing queries - Optimizing queries, determining execution plans, and sending execution plans to the nodes - Receiving result sets and sending them back to the querier Worker nodes, which do their part of the query execution job and eventually ship data back to the headIn primitive forms of this architecture, there's a "fat head" that does altogether too much aggregation and query resolution. In more mature versions, data is shipped intelligently from worker nodes to their peers, reducing or eliminating "fat head" bottlenecks.

Exceptions to the base case include Vertica and Exasol. In their systems, all nodes run identical software. At the other extreme, some vendors use dedicated nodes for particular purposes. For example, Aster Data famously has special nodes for bulk data loading and export. Greenplum has a logical split between nodes that execute queries and nodes that talk to storage, and is considering offering the option of physically separating them in a future release.

The basic tradeoffs between these schemes go something like this:

• If there are more kinds of dedicated nodes, real-time load-balancing is harder; you're more likely to have idle capacity. • If there are more kinds of dedicated nodes, you can optimize hardware better, by using different kinds of hardware for different kinds of nodes. Potentially, this is a bigger factor if some kinds of nodes have dedicated disks attached and some don't.

Calpont, which hasn't actually shipped a DBMS yet, has an interesting twist. They're building a columnar DBMS in which the querying work is split between a kind of worker node, which does the query processing, and a storage node, which talks to disk. These nodes are not in any kind of one-to-one correspondence; any worker node can talk with any storage node. Calpont believes that in the future some of the storage node logic can migrate into storage systems themselves, in almost a Netezza-like strategy, but on more standard equipment.

The Calpont story may actually make more sense in a shared-disk storage-area-network implementation than for a fully shared-nothing MPP, but that's a subject for a different post.I talk with lots of vendors of MPP data warehouse DBMS. I've now heard enough different approaches to MPP architecture that I think it might be interesting to contrast some of the alternatives. The base-case MPP DBMS architecture is one in which there are two kinds of nodes...

We welcome your comments on this topic on our social media channels, or [contact us directly] with questions about the site.
Comment  | 
Print  | 
More Insights
Commentary
Will AI and Machine Learning Break Cloud Architectures?
Lisa Morgan, Freelance Writer,  6/10/2019
Slideshows
9 Steps Toward Ethical AI
Cynthia Harvey, Freelance Journalist, InformationWeek,  5/15/2019
Commentary
Humans' Fascination with Artificial General Intelligence
Guest Commentary, Guest Commentary,  6/6/2019
White Papers
Register for InformationWeek Newsletters
Video
Current Issue
A New World of IT Management in 2019
This IT Trend Report highlights how several years of developments in technology and business strategies have led to a subsequent wave of changes in the role of an IT organization, how CIOs and other IT leaders approach management, in addition to the jobs of many IT professionals up and down the org chart.
Slideshows
Flash Poll