
CERN Project Will Collect Hundreds Of Petabytes Of Data




Near the Franco-Swiss border west of Geneva, Switzerland, CERN, the European Organization for Nuclear Research, is constructing a particle accelerator that scientists hope will give them new insights into the structure of matter. When the Large Hadron Collider begins operating in 2006, it will generate between 5 and 20 petabytes of raw data each year, for a total of hundreds of petabytes of data in the collider's projected lifetime of 10 to 15 years.
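As a rough check on the "hundreds of petabytes" figure, the bounds follow from multiplying the quoted annual rates by the quoted lifetimes. The sketch below assumes a constant yearly rate over the whole lifetime, which the article does not state; the function name is illustrative.

```python
def lifetime_total_pb(rate_pb_per_year, years):
    """Total raw data in petabytes, assuming a constant yearly rate (an assumption)."""
    return rate_pb_per_year * years

# Figures quoted in the article: 5 to 20 PB per year, over 10 to 15 years.
low = lifetime_total_pb(5, 10)
high = lifetime_total_pb(20, 15)
print(f"Projected lifetime total: {low} to {high} PB")
```

Under that assumption the total falls between 50 and 300 petabytes, so the "hundreds" description applies to the upper end of the range.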

CERN is designing a data warehouse to store all that information. The organization has created a prototype system with several hundred terabytes of simulation and test data stored in an object database from Objectivity Inc. That database is expected to reach 1 petabyte by 2004, says Jamie Shiers, a database group leader at CERN.

The organization is still mulling over many details about the collider database. What it does know is that the system will run on Intel servers. The test system has about 1,000 dual-processor servers running Linux on IA-32 microprocessors, which are soon to be upgraded to IA-64 microprocessors. "Our budget is very tight," Shiers says when asked about the reason for using the low-cost, open-source operating system.

The CERN development team is considering using the Oracle9i database for the data warehouse. The Objectivity software is something of a standard among physics labs: The Stanford Linear Accelerator Center at Stanford University also uses Objectivity. But CERN is considering Oracle for support reasons, since Oracle has extensive European operations and Objectivity is far away in Mountain View, Calif., Shiers says.

Early tests using Oracle's Real Application Clusters technology have had "encouraging results," Shiers says, although CERN hasn't decided which clustering technology to use.

CERN IT staffers also are debating how much data to store on tape and how much on disk. One possible approach is to keep a month's worth of data on disk for quick access and archive the rest of it on tape. Shiers says that decision will hinge on balancing the cost against data-access patterns.
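To give a sense of scale for the one-month-on-disk approach, the sketch below splits accumulated data between disk and tape. It assumes data arrives at a steady rate and that only the most recent window stays on disk; the function and parameter names are hypothetical, and CERN's actual policy was undecided at the time of writing.

```python
def disk_tape_split(rate_pb_per_year, months_elapsed, disk_window_months=1):
    """Return (disk_pb, tape_pb) after months_elapsed of steady data taking.

    Assumes a constant data rate and that only the newest
    disk_window_months of data are kept on disk (both are assumptions).
    """
    total_pb = rate_pb_per_year * months_elapsed / 12
    disk_pb = rate_pb_per_year * min(disk_window_months, months_elapsed) / 12
    return disk_pb, total_pb - disk_pb

# Upper rate quoted in the article: 20 PB per year, after one year of running.
disk_pb, tape_pb = disk_tape_split(20, 12)
print(f"Disk: {disk_pb:.2f} PB, tape: {tape_pb:.2f} PB")
```

At the 20-petabyte annual rate, a one-month window already means well over a petabyte of disk, which helps explain why the decision hinges on cost versus access patterns.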
