New TPC Benchmark Poses Realistic Hurdles For Database Performance Results

The new benchmark requires the database to deal with several types of complex data. The document specifying the transaction steps and how they must be executed is 264 pages long.

Charles Babcock, Editor at Large, Cloud

March 20, 2007

A newly available database benchmark, TPC-E, is intended to better measure database cost and performance. But the nonprofit Transaction Processing Performance Council, which created the test, says no auditors have been certified yet to verify results.

Mike Molloy, a senior manager at the Dell Performance Lab for Servers and chairman of the council, said getting auditors certified to review results will be the next step toward implementing the new measure.

Molloy said the council's members, which include leading database vendors Oracle, IBM, and Microsoft, participated in creating the benchmark. "The object is to move the state of the art forward" in measuring one system versus another, and many vendors can be expected to test their systems against the benchmark, he said in an interview.

TPC-E sought to close several loopholes that existed with its predecessor, TPC-C. That benchmark specified the steps of a transaction that needed to be executed by a system but left implementation up to each vendor. Numerous shortcuts that had never been envisioned by the benchmark's authors were found, such as searching for masses of user names that all begin with the same two letters or retrieving highly similar data from one section of the database rather than random data. The reported performance results were sometimes higher than any real-life user could anticipate receiving.

Molloy said the TPC-C benchmark was conceived as a straightforward warehouse retrieval transaction, where a store manager of a retail chain orders something from the warehouse. TPC-E is conceived as a stock trade through a brokerage house, with several complex steps needing to be completed.

Rather than leaving it to each vendor to write its own testing code, the council will supply C++ code that must be used with the benchmark. The council is also supplying U.S. Census data as a source of real names and information. "There's more Smiths than Zehorians," noted Molloy, so the data gives the database a more realistic spread across the alphabet when it is called on to find customers. The benchmark will likewise supply publicly available New York Stock Exchange trading data to keep the test realistic.

The new benchmark requires the database to deal with several types of complex data, including software objects, which need to be rebuilt after being broken down and stored in a relational database table. It also imposes the normal database task of checking user-supplied constraints, such as a rule that a ZIP code must be a five-digit number. Under TPC-E, 22 such constraint checks must be performed.
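To make the ZIP code example concrete, here is a minimal sketch of a constraint check, written in Python against SQLite rather than taken from the TPC-E specification. The `address` table, its columns, and the `try_insert` helper are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical schema: the table and column names are illustrative only
# and are not taken from the TPC-E specification.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE address (
        id  INTEGER PRIMARY KEY,
        zip TEXT NOT NULL
            -- exactly five characters, none of them a non-digit
            CHECK (length(zip) = 5 AND zip NOT GLOB '*[^0-9]*')
    )
    """
)


def try_insert(zip_code):
    """Return True if the database accepts the ZIP code, False if the check rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO address (zip) VALUES (?)", (zip_code,))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("13244")  # five digits: passes the check
rejected = try_insert("1324A")  # contains a letter: fails the check
```

The point of the sketch is that the database itself, not the application, refuses the bad value, which is the kind of work a constraint check adds to each transaction.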

The document specifying the transaction steps and how they must be executed is 264 pages long.

In addition, TPC-E requires looking up records based on four times as many primary keys as its predecessor, TPC-C. A primary key is a column, or set of columns, whose value uniquely identifies a row in a database table, such as an account number, and can be looked up in the database's index.

At the same time, TPC-E requires five times as many foreign key look-ups as TPC-C. A foreign key is a column whose value refers to the primary key of a row in another table, and so supplies a relationship between the two tables; it is not necessarily indexed itself. A customer's information may be indexed on an account number, while the number of the customer's sales representative serves as a foreign key into an employee table, where the representative's name and history with the company are stored.
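The two kinds of look-up can be sketched with a pair of small tables. This is an illustrative example in Python and SQLite, not the benchmark's schema: the `customer` and `employee` tables, their columns, and the sample rows are all assumptions made for the sake of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Hypothetical tables and rows, for illustration only.
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL
    );
    CREATE TABLE customer (
        account_no INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        rep_id     INTEGER NOT NULL REFERENCES employee(emp_id)
    );
    INSERT INTO employee VALUES (7, 'Rep Seven');
    INSERT INTO customer VALUES (1001, 'Customer A', 7);
    """
)

# Primary key look-up: find one customer row by its account number.
customer = conn.execute(
    "SELECT name, rep_id FROM customer WHERE account_no = ?", (1001,)
).fetchone()

# Foreign key look-up: follow rep_id from the customer row to the employee row.
rep = conn.execute(
    """
    SELECT e.name
    FROM customer AS c
    JOIN employee AS e ON e.emp_id = c.rep_id
    WHERE c.account_no = ?
    """,
    (1001,),
).fetchone()
```

Here `customer` holds the row found through the primary key, and `rep` holds the employee reached by following the foreign key, so a benchmark that multiplies both kinds of look-up exercises the index and the join path alike.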

"Most benchmarks mimic actions but don't use actual data. We use actual census and New York Stock Exchange data," said Molloy.

Vendors may still write stored procedures and other logic that speeds the internal operation of their systems, he said.

Despite being 15 years old, TPC-C is used throughout the industry. One recent TPC-C test measured Oracle running on HP's Intel-based servers.

In addition to performance, TPC-C and TPC-E offer results indicating the cost of a system over a five-year period, sometimes expressed as a cost per transaction.
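One common way to read a cost-per-transaction figure is as the total system cost divided by the measured throughput. The sketch below shows only that arithmetic; the function name and the dollar and throughput figures are made up for illustration and are not published benchmark results.

```python
def price_performance(total_cost_usd: float, throughput: float) -> float:
    """Divide total system cost by measured throughput.

    Both arguments are hypothetical inputs; the throughput unit is whatever
    the benchmark reports (for example, transactions per minute).
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return total_cost_usd / throughput


# Made-up figures: a $1,000,000 system measured at 50,000 transactions per minute.
example = price_performance(1_000_000.0, 50_000.0)  # 20.0 dollars per unit of throughput
```

A lower figure means the system delivers the same throughput for less money, which is why the cost result is reported alongside raw performance.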

The council has 18 vendor members who pay $15,000 a year to maintain the council's work. In addition to Dell, IBM, Microsoft, and Oracle, other members include AMD, Bull, Fujitsu, Fujitsu-Siemens Computers, Hitachi, HP, Ingres, Intel, NEC, Netezza, Sun, Sybase, Teradata, and Unisys.

This story was modified on March 21 to correct the spelling of Mike Molloy's name.

About the Author(s)

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
