'Exaflop' Supercomputer Planning Begins

Backed by $7.4 million in funding, computer scientists aim to narrow the gap between theoretical peak performance and actual performance through new architectures.

Researchers at Sandia and Oak Ridge National Laboratories are preparing for the challenges of developing an exascale computer at the new Institute for Advanced Architectures.

Through the IAA, scientists plan to conduct the basic research required to create a computer capable of performing a million trillion calculations per second, otherwise known as an exaflop. That's a million times faster than today's teraflop computers and a thousand times faster than the petaflop barrier, which was broken in 2006.
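The scale comparisons above follow directly from the SI prefixes. A minimal sketch of the arithmetic, where the constant and function names are illustrative and not taken from the IAA:

```python
# SI prefixes expressed as floating-point operations per second (FLOPS).
TERAFLOP = 10**12  # a trillion operations per second
PETAFLOP = 10**15
EXAFLOP = 10**18   # "a million trillion" operations per second


def speedup(target_flops, baseline_flops):
    """Return how many times faster target is than baseline."""
    return target_flops // baseline_flops


print(speedup(EXAFLOP, TERAFLOP))  # 1000000
print(speedup(EXAFLOP, PETAFLOP))  # 1000
```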

Sandia's ASCI Red became the world's first teraflop computer in late 1996.

"We're actually not building an exaflop supercomputer," said Sandia project lead Sudip Dosanjh. Rather, he said, the U.S. Department of Energy and the National Security Agency have made it clear that they expect to need exaflop computing around 2018. The anticipated applications, he said, include large-scale predictions, such as those of global climate change, as well as materials science analysis, fusion research, and national security problems that he could not discuss.

To meet those requirements, "there are a number of research challenges we need to get to work on," said Dosanjh. "We really need to do that in collaboration with industry and academia. We want to do R&D that will impact real systems in the next decade."

One such challenge is power consumption. "An exaflop supercomputer might need 100 megawatts of power, which is a significant portion of a power plant," said Dosanjh. "We need to do some research to get that down. Otherwise no one will be able to power one."
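To put the 100-megawatt figure in perspective, the sketch below works out the energy efficiency it implies. It assumes the machine sustains a full exaflop while drawing exactly 100 megawatts; both are simplifications for illustration, and the function names are hypothetical.

```python
EXAFLOP = 1e18       # floating-point operations per second
POWER_WATTS = 100e6  # 100 megawatts, the figure Dosanjh cites


def flops_per_watt(flops, watts):
    """Operations per second delivered for each watt consumed."""
    return flops / watts


def joules_per_op(flops, watts):
    """Energy spent on a single operation (1 watt = 1 joule per second)."""
    return watts / flops


efficiency = flops_per_watt(EXAFLOP, POWER_WATTS)
energy = joules_per_op(EXAFLOP, POWER_WATTS)

print(f"{efficiency / 1e9:.0f} gigaflops per watt")      # 10 gigaflops per watt
print(f"{energy * 1e12:.0f} picojoules per operation")   # 100 picojoules per operation
```

Under these assumptions, every operation has an energy budget of about 100 picojoules, and any reduction in total power has to come from shrinking that per-operation cost.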

Then there's the issue of reliability, which tends to decline as the parts count increases. Given that an exascale computer might have a million 100-core processors, Dosanjh speculated that such a machine might run for only 10 minutes before suffering a failure. To manage a machine with so many parts, new fault-tolerance schemes need to be developed.
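The link between parts count and reliability can be sketched with a standard textbook model. The code below assumes that processors fail independently at a constant rate, so the mean time between failures (MTBF) of the whole system is the per-processor MTBF divided by the number of processors. That model is an assumption made here for illustration, not something Dosanjh described, and the function names are hypothetical.

```python
MINUTES_PER_YEAR = 60 * 24 * 365


def system_mtbf(part_mtbf, part_count):
    """System MTBF when any single part failure brings the system down.

    Assumes independent parts with constant failure rates, so the
    failure rates add and the system MTBF is part_mtbf / part_count.
    """
    return part_mtbf / part_count


def implied_part_mtbf(target_system_mtbf, part_count):
    """Per-part MTBF needed to achieve a given system MTBF."""
    return target_system_mtbf * part_count


processors = 1_000_000      # "a million" processors, from the article
system_mtbf_minutes = 10    # Dosanjh's speculated time to first failure

part_minutes = implied_part_mtbf(system_mtbf_minutes, processors)
print(f"Implied per-processor MTBF: {part_minutes / MINUTES_PER_YEAR:.1f} years")
```

Under this model, a 10-minute system MTBF corresponds to each processor failing only about once every 19 years, which illustrates why even very reliable parts are not enough at this scale without fault tolerance.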

Data movement is also a critical concern, said Dosanjh. "The rate of memory access has not kept up with the ability of these processors to do floating point operations," he said.

And in addition to the hardware engineering challenges, programmers have to be educated to write code for such massively parallel systems. "As far as the industry is concerned, there needs to be an education effort as well to get people trained to write software at this scale," said Dosanjh.

Just such an effort is already under way. Last October, Google and IBM launched an educational initiative to teach programmers at several universities how to code for large-scale distributed computing systems.

The IAA held its initial meeting in January, attended by almost 50 representatives from government, academia, and industry. The topic of discussion was memory in high-performance computing. Dosanjh said that at the organization's next meeting, researchers will discuss interconnects, the networks inside supercomputers.
