File Systems That Fly - InformationWeek

File Systems That Fly

The superfast input-output speeds of cluster file systems could change the way companies approach storage

The NCSA's archive server grows by 40 to 60 terabytes per month because of data from National Science Foundation jobs being run on its computers. As recently as the late '90s, computer scientists learned how to use memory in their programs to avoid reading and writing data to disk. "Today that's no longer taught," Butler says. "The practices of computer scientists have changed so much."

Traditionally, there have been a couple of ways to make storage systems scale up. Highly standardized network-attached storage systems use popular protocols for file sharing on a LAN, such as Microsoft's CIFS or the Network File System that's a standard on Unix and Linux systems. Those let users attach many computers to a server and share a virtual disk that lives on the network. NAS uses inexpensive Ethernet connections between computers but transmits data at a relatively slow 1 Gbps for most applications. That causes logjams, since talking to local disk drives is faster than communicating with those on the network.

Storage area networks deliver higher speeds than NAS, reaching 2 to 4 Gbps, but they need expensive Fibre Channel switches and require a board in each computer that can cost as much as $1,000. The iSCSI protocol, which lets disks and computers in a SAN talk directly over Ethernet, is gaining popularity in shared storage networks as well.
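To put those link speeds in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes ideal line rates with no protocol overhead, decimal units, and a hypothetical 100-Gbyte data set. The function name and the data-set size are illustrative only and do not come from any vendor.

```python
def transfer_seconds(size_gbytes: float, link_gbps: float) -> float:
    """Ideal time to move size_gbytes over a link rated at link_gbps.

    Ignores protocol overhead, contention, and disk limits, so real
    transfers would take longer.
    """
    size_gbits = size_gbytes * 8  # 8 bits per byte, decimal units
    return size_gbits / link_gbps


DATA_SET_GBYTES = 100  # hypothetical data-set size, chosen for illustration

# Nominal link speeds quoted in the article
LINKS = {
    "NAS over 1-Gbps Ethernet": 1.0,
    "SAN at 2 Gbps": 2.0,
    "SAN at 4 Gbps": 4.0,
}

for name, gbps in LINKS.items():
    seconds = transfer_seconds(DATA_SET_GBYTES, gbps)
    print(f"{name}: {seconds:.0f} s")
```

Even in this idealized picture, every client sharing one such link splits the same bandwidth, which is the bottleneck cluster file systems aim to relieve.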

The fast disk-communication speeds of cluster file systems could appeal to companies in industries that run highly I/O-dependent software, including banking, oil exploration, microchip manufacturing, automaking, aerospace, and Hollywood computer animation.


Increases in the speed of shuttling data between computers and disks in a cluster haven't kept pace with advances in microprocessor and memory speeds, compromising application performance.

An emerging class of file-system software for clusters speeds up input-output operations, potentially changing the way companies approach storage.

Cluster file systems already are in use at universities, national labs, and supercomputing research centers and stand to make inroads into general business computing in the near future.

The small companies selling cluster file systems are starting to land some marquee customers. Cluster File Systems Inc., owner of the intellectual property associated with the Lustre software, counts Chevron Corp. among its customers. Startup Panasas Inc. sells a cluster file system called ActiveScale on its own hardware, which is used by customers including Walt Disney Co. Larger vendors also are exploring the technology: IBM is considering adding object-based storage to its GPFS file system, according to Illuminata. Dell, meanwhile, has a deal with Ibrix to sell Ibrix's Fusion file system.

Demand for servers running Ibrix technology is causing Dell to re-examine how it bundles networking with computing. Dell historically has figured on about a gigabyte per second of bandwidth for every teraflop of computing power it sells a customer, says Victor Mashayekhi, a senior manager in Dell's scalable systems group. "Over time, we'll see that ratio increase," he says. "You'll see computing nodes become hungrier. Driving that is the amount of data that needs to be consumed."

Large data requirements compelled the Texas Advanced Computing Center at the University of Texas in Austin to bring in Ibrix's Fusion. The center, which provides computing to the university's researchers, uses the file system to speed up the performance of a computational fluid-dynamics application that simulates turbulence for aerodynamics and is used by about 1,700 scientists. Each process in the program needs to write a file that's between 300 and 400 Mbytes in size. Using NFS, writing all the data required--about 20 Gbytes--took about 50 minutes, says Tommy Minyard, the center's high-performance computing group manager. That needed to occur every hour. As a result, for each hour the application ran, it computed for just 10 minutes. With Ibrix, the center's I/O write time for that program has shrunk to five minutes per hour. "It's benefiting all users of the system," Minyard says.
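The arithmetic behind those figures can be sketched in a few lines of Python. This is an illustration only: it assumes decimal units (1 Gbyte = 1,000 Mbytes), and it assumes the same roughly 20 Gbytes is written per hour under both file systems, which the center's figures imply but do not state outright. The function name `io_profile` is hypothetical.

```python
def io_profile(data_gbytes: float, write_minutes: float,
               cycle_minutes: float = 60.0) -> tuple[float, float]:
    """Return (aggregate write throughput in Mbytes/s, fraction of the
    cycle left for computing) for one checkpoint cycle."""
    write_seconds = write_minutes * 60
    throughput_mb_s = data_gbytes * 1000 / write_seconds  # decimal units
    compute_fraction = (cycle_minutes - write_minutes) / cycle_minutes
    return throughput_mb_s, compute_fraction


# Figures quoted by the center: about 20 Gbytes written each hour
nfs_throughput, nfs_compute = io_profile(20, write_minutes=50)
ibrix_throughput, ibrix_compute = io_profile(20, write_minutes=5)

print(f"NFS:   {nfs_throughput:.1f} Mbytes/s, "
      f"{nfs_compute:.0%} of each hour computing")
print(f"Ibrix: {ibrix_throughput:.1f} Mbytes/s, "
      f"{ibrix_compute:.0%} of each hour computing")
```

Under these assumptions, a tenfold cut in write time works out to a tenfold gain in aggregate write throughput, and the share of each hour available for computation rises from one-sixth to eleven-twelfths.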

That's the good news. On the other hand, standards for cluster file systems are just emerging, and the software is owned by small companies without a proven track record. The systems also are limited in the types of computers on which they can run. Lustre is available only for Linux and Catamount, an obscure operating system from supercomputer maker Cray Inc. It's also hard to install and not well documented. That's causing some business-technology managers to wait and see.

"Lustre looks interesting, but we haven't deployed anything," says Andy Hendrickson, head of production technology at DreamWorks Animation SKG. "The big question is, does it do what it says and is it reliable enough to base production on? We have hard deadlines." And, perhaps summarizing the concerns for many companies considering cluster file systems, Hendrickson asks, "How much would we have to change our process?" Whether the benefits of this emerging technology outweigh the risks for businesses remains to be seen.
