CERN's Hadron Collider Research Fueled By OpenStack
CERN works with Rackspace to upgrade its cloud infrastructure to OpenStack Grizzly in time for the reactivation of its Large Hadron Collider, which is being used to pursue the Higgs boson particle.
CERN, the birthplace of the World Wide Web, is rebuilding its Large Hadron Collider and re-architecting its data center infrastructure on OpenStack Grizzly as it continues its pursuit of the Higgs boson particle and other advanced physics.
At the end of the process, it will be able to collect twice as much data from a research experiment in the collider as before, and that data will be uploaded to a "federated" Grizzly OpenStack cloud. Grizzly is the name of the OpenStack project's seventh and latest release, which came out in April. CERN's federated cloud will encompass 15,000 servers in two locations, Budapest, Hungary, and Geneva, Switzerland.
The collider is being rewired and rebuilt with stronger magnets to run at twice its previous power level. That means it will generate twice as much data as it did when it was taken offline earlier this year, after a breakthrough in particle physics on March 14, says Tim Bell, CERN infrastructure manager.
When the collider was shut down, it had succeeded in producing a collision between two proton streams, each powered by 3.5 trillion electron volts. Each such experiment yields two petabytes of data, with collectors running the data through filters to yield 20 GB per second. The Grizzly OpenStack cloud will be collecting "at least twice that amount of data. It's quite difficult to record data at that rate," Bell said in an interview prior to Monday's announcement. Both the collider and the OpenStack clouds supporting it are scheduled to resume operation at the start of 2015.
To build its federated OpenStack cloud, CERN has signed an agreement with Rackspace, the San Antonio, Texas-based company that founded the OpenStack project with NASA. Rackspace consultants and engineers will contribute their expertise through the CERN OpenLab, which provides a way for commercial companies to donate staff and implement technology at the scale that the CERN project demands.
Bell, a member of the OpenStack Foundation board, said OpenStack is a natural fit with CERN, which has an abiding interest in open source projects. It spotted OpenStack as code "with a vibrant community" and "commercial support." Its OpenLab is meant to draw in commercial suppliers in a way that's acceptable to the 20 European nations that support the facility.
On its website, CERN describes OpenLab as a private/public partnership: "What makes the partnership work is that CERN is a blue-sky laboratory that uses very real world techniques to unlock the mysteries of nature. And we are perhaps the most demanding customer of them all."
Bell said CERN will be able to manage the large increase in compute resources better through the OpenStack cloud without increasing staff. The conversion will also allow scientists to provision and launch their experiments with less intervention by IT staff.
John Engates, CTO of Rackspace, said in an interview that Rackspace was happy to take on the task of helping design and build the federated OpenStack cloud at CERN because it constituted such a credible proving ground for both the code and Rackspace deployment methods, given the worldwide attention received by its experiments.
At the same time, there may be an additional reason. With CERN running a recent version of OpenStack, it will be positioned to turn to Rackspace for additional public cloud service if its own facilities become overtaxed. Rackspace operates a data center in the United Kingdom, which Engates said was close enough to support CERN requirements.
Bell pointed out that despite the fact that CERN will increase its servers from about 7,500 to 15,000 in its federated cloud, its services are periodically overtaxed as CERN's 11,000 physicist and scientific users around the world gather for their annual meeting. "As you approach that conference, the workload increases dramatically. Everybody tries to get in one last data analysis," said Bell. With the Eurozone extremely cost conscious about budget deficits, "we don't want to add staff or over provision. That is the point that we would want to overflow into the public cloud."
As it is, OpenLab now has the task of re-purposing existing CERN servers or deploying new ones into cloud clusters at the rate of 200 a week. A small staff of four is responsible for doing so, but most of the 200-member CERN IT staff is involved in the transition one way or the other, Bell said. Implementing open source code "is very much part of the culture of our organization," and so far it's going smoothly.
Twenty years ago, CERN staffer Tim Berners-Lee gave the world a content display and hypertext transfer system that became known as the World Wide Web. Now CERN is attempting to accelerate its findings in physics. "We're homing in on the Higgs boson particle," noted Bell, with other scientists wanting to do experiments on matter/anti-matter and dark energy. "There's no shortage of problems at all. We want to probe the places we couldn't probe before," he noted.
When protons collide in an accelerator, they produce unstable particles that swiftly decay into more recognizable protons and electrons. Proving the existence of the intermediate particles can either support or detract from established theories of quantum physics, one of which is that a Higgs boson particle will be created in a high-speed collision. Unidentified particles suspected of being Higgs bosons were detected July 4, 2012, as a result of a Hadron Collider experiment, with the particles created again and studied as a result of a collision staged March 14 of this year. Researchers at CERN also created antihydrogen in 2010 and maintained its existence for 15 minutes in 2011.
And in addition to learning about physics, Engates says, Rackspace will learn about running clouds at scale with peak rates of data transfer for thousands of users seeking access. "We can benefit from their work," he adds.