New TPC Benchmark Poses Realistic Hurdles For Database Performance Results
The new benchmark requires the database to deal with several types of complex data. The document specifying the transaction steps and how they must be executed is 264 pages long.
A newly available database benchmark, TPC-E, is intended to better measure database cost and performance. But the nonprofit Transaction Processing Performance Council, which created the test, says no auditors have been certified yet to verify results.
Mike Molloy, a senior manager at the Dell Performance Lab for Servers and chairman of the council, said getting auditors certified to review results will be the next step toward implementing the new measure.
Molloy said the council's members, which include leading database vendors Oracle, IBM, and Microsoft, participated in creating the benchmark. "The object is to move the state of the art forward" in measuring one system versus another, and many vendors can be expected to test their systems against the benchmark, he said in an interview.
TPC-E sought to close several loopholes that existed with its predecessor, TPC-C. That benchmark specified the steps of a transaction that needed to be executed by a system but left implementation up to each vendor. Numerous shortcuts that had never been envisioned by the benchmark's authors were found, such as searching for masses of user names that all begin with the same two letters or retrieving highly similar data from one section of the database rather than random data. The reported performance results were sometimes higher than any real-life user could anticipate receiving.
Molloy said the TPC-C benchmark was conceived as a straightforward warehouse retrieval transaction, where a store manager of a retail chain orders something from the warehouse. TPC-E is conceived as a stock trade through a brokerage house, with several complex steps needing to be completed.
Rather than leaving it to each vendor to write its own testing code, the council will supply C++ code that must be used with the benchmark. The council is also supplying U.S. Census data as a source of real names and information. "There's more Smiths than Zehorians," noted Molloy, which gives the database a more realistic spread across the alphabet when called on to find customers. The benchmark will likewise supply publicly available New York Stock Exchange trading data to keep the test realistic.
The new benchmark requires the database to deal with several types of complex data, including software objects, which need to be rebuilt after being broken down and stored in a relational database table. It also imposes the normal database task of checking user-supplied constraints, such as a rule that a ZIP code must be a five-digit number. Under TPC-E, 22 such constraint checks must be performed.
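To make the idea of a constraint check concrete, here is a minimal sketch of the ZIP code rule mentioned above. It is an illustration only, not code from the TPC-E kit; the function name is hypothetical, and a real database would typically enforce such a rule with a declared constraint rather than application code.

```python
import re

# Exactly five ASCII digits, nothing before or after.
FIVE_DIGITS = re.compile(r"[0-9]{5}")


def is_valid_zip(value: str) -> bool:
    """Return True only if value is a five-digit ZIP code (hypothetical helper)."""
    return FIVE_DIGITS.fullmatch(value) is not None
```

A value such as "75001" passes the check, while "7500", "7500a", and "123456" are rejected.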
The document specifying the transaction steps and how they must be executed is 264 pages long.
In addition, TPC-E requires looking up records based on four times as many primary keys as its predecessor, TPC-C. A primary key is a value that uniquely identifies a row of data in a database table, such as an account number, and can be looked up in the database's index.
At the same time, TPC-E requires five times as many foreign key look-ups as TPC-C. A foreign key is a column whose values refer to the primary key of another table; it is not necessarily indexed itself, but it supplies the relationship between the two tables. A customer's information may be indexed on an account number, for example, while the number of the customer's sales representative serves as a foreign key into an employee table that holds the representative's name and history with the company.
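The customer and sales representative example can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows below are invented for illustration and are not taken from the TPC-E schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with one primary key per table
    and one foreign key linking customer to employee (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # Ask SQLite to enforce foreign key constraints on this connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE employee ("
        " rep_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE customer ("
        " account_no INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " rep_id INTEGER NOT NULL REFERENCES employee(rep_id))"
    )
    conn.execute("INSERT INTO employee VALUES (7, 'Pat Jones')")
    conn.execute("INSERT INTO customer VALUES (1001, 'A. Smith', 7)")
    conn.commit()
    return conn


def rep_for_account(conn: sqlite3.Connection, account_no: int):
    """Find the customer by primary key, then follow the foreign key
    to the employee table to get the sales representative's name."""
    row = conn.execute(
        "SELECT e.name FROM customer AS c"
        " JOIN employee AS e ON e.rep_id = c.rep_id"
        " WHERE c.account_no = ?",
        (account_no,),
    ).fetchone()
    return row[0] if row else None
```

Calling rep_for_account(build_demo_db(), 1001) finds the customer row by its primary key and then follows rep_id into the employee table; an unknown account number returns None.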
"Most benchmarks mimic actions but don't use actual data. We use actual census and New York Stock Exchange data," said Molloy.
Vendors may still write stored procedures and other logic that speeds the internal operation of their systems, he said.
Despite being 15 years old, TPC-C is used throughout the industry. One recent TPC-C test looked at Oracle on HP's Intel-based servers.
In addition to performance, TPC-C and TPC-E offer results indicating the cost of a system over a five-year period, sometimes expressed as a cost per transaction.
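One way to read a cost-per-transaction figure is as the total system cost divided by the measured throughput. The sketch below shows only that arithmetic; the function name and the dollar and throughput figures are made up for illustration and do not come from any published result.

```python
def price_performance(total_cost_usd: float, throughput: float) -> float:
    """Divide total system cost by measured throughput (hypothetical helper).

    throughput is whatever rate the benchmark reports, for example
    transactions per minute or per second.
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return total_cost_usd / throughput


# Made-up figures: a $1.2 million system sustaining 800 transactions
# per unit of time works out to $1,500 per unit of throughput.
example = price_performance(1_200_000, 800)
```

A lower figure means a vendor delivers the same throughput for less money, which is why the cost number is reported alongside raw performance.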
The council has 18 vendor members who pay $15,000 a year to maintain the council's work. In addition to Dell, IBM, Microsoft, and Oracle, other members include AMD, Bull, Fujitsu, Fujitsu-Siemens Computers, Hitachi, HP, Ingres, Intel, NEC, Netezza, Sun, Sybase, Teradata, and Unisys.
This story was modified on March 21 to correct the spelling of Mike Molloy's name.