CERN Project Will Collect Hundreds Of Petabytes Of Data

CERN, the European Organization for Nuclear Research, is constructing a particle accelerator near the Franco-Swiss border west of Geneva that scientists hope will give them new insights into the structure of matter. When the Large Hadron Collider begins operating in 2006, it will generate between 5 and 20 petabytes of raw data each year, for a total of hundreds of petabytes over the collider's projected lifetime of 10 to 15 years.
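As a rough check on those figures, the annual rate and the projected lifetime can be multiplied out. The sketch below uses only the numbers quoted above; the function and constant names are illustrative, and a constant yearly rate is an assumption made for simplicity.

```python
# Figures quoted in the article; a constant yearly rate is an assumption.
RAW_PB_PER_YEAR = (5, 20)   # low and high estimates of raw data per year, in petabytes
LIFETIME_YEARS = (10, 15)   # low and high estimates of the collider's lifetime


def lifetime_range_pb(rate=RAW_PB_PER_YEAR, years=LIFETIME_YEARS):
    """Return the (lowest, highest) cumulative raw-data volume in petabytes."""
    lowest = rate[0] * years[0]    # slowest rate over the shortest lifetime
    highest = rate[1] * years[1]   # fastest rate over the longest lifetime
    return lowest, highest


low, high = lifetime_range_pb()
print(f"{low} to {high} PB of raw data over the collider's lifetime")
```

Under these assumptions the total falls between 50 and 300 petabytes, so the "hundreds of petabytes" figure corresponds to the upper part of the quoted range.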

CERN is designing a data warehouse to store all that information. The organization has created a prototype system with several hundred terabytes of simulation and test data stored in an object database from Objectivity Inc. That database is expected to reach 1 petabyte by 2004, says Jamie Shiers, a database group leader at CERN.

The organization is still mulling over many details about the collider database. What it does know is that the system will run on Intel servers. The test system has about 1,000 dual-processor servers that use IA-32 microprocessors, soon to be upgraded to IA-64 microprocessors, and run Linux. "Our budget is very tight," Shiers says when asked why CERN chose the low-cost, open-source operating system.

The CERN development team is considering using the Oracle9i database for the data warehouse. The Objectivity software is something of a standard among physics labs: The Stanford Linear Accelerator Center at Stanford University also uses Objectivity. But CERN is considering Oracle for support reasons, since Oracle has extensive European operations and Objectivity is far away in Mountain View, Calif., Shiers says.

Early tests using Oracle's Real Application Clusters clustering technology have had "encouraging results," Shiers says, although CERN hasn't decided on what clustering technology to use.

CERN IT staffers also are debating how much data to store on tape and how much on disk. One possible approach is to keep a month's worth of data on disk for quick access and archive the rest of it on tape. Shiers says that decision will hinge on balancing the cost against data-access patterns.
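To give a sense of scale for the one-month-on-disk approach, the sketch below splits a cumulative data volume between disk and tape. It is a back-of-the-envelope illustration, not CERN's planning model: it assumes data arrives at a uniform rate through the year, and the function name and parameters are hypothetical.

```python
def disk_tape_split(annual_pb, years_elapsed, disk_months=1):
    """Split cumulative data between disk and tape, in petabytes.

    Assumes a uniform data rate through the year (an illustrative
    simplification) and that the most recent `disk_months` of data
    stay on disk while everything older is archived to tape.
    """
    total = annual_pb * years_elapsed
    # The disk tier can never hold more than the total collected so far.
    disk = min(total, annual_pb * disk_months / 12)
    tape = total - disk
    return disk, tape


# The article's low and high annual estimates, after one year of operation.
for annual_pb in (5, 20):
    disk, tape = disk_tape_split(annual_pb, years_elapsed=1)
    print(f"{annual_pb} PB/year: {disk:.2f} PB on disk, {tape:.2f} PB on tape")
```

Even at the high estimate, one month of data is under 2 petabytes, which is why the cost question turns on how often older data is actually read back.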
