File Systems That Fly

The superfast input-output speeds of cluster file systems could change the way companies approach storage

Aaron Ricadela, Contributor

June 17, 2005

The NCSA's archive server grows by 40 to 60 terabytes per month because of data from National Science Foundation jobs being run on its computers. As recently as the late '90s, computer scientists learned how to use memory in their programs to avoid reading and writing data to disk. "Today that's no longer taught," Butler says. "The practices of computer scientists have changed so much."

Traditionally, there have been a couple of ways to make storage systems scale up. Highly standardized network-attached storage systems use popular protocols for file sharing on a LAN, such as Microsoft's CIFS or the Network File System that's a standard on Unix and Linux systems. Those let users attach many computers to a server and share a virtual disk that lives on the network. NAS uses inexpensive Ethernet connections between computers but transmits data at a relatively slow 1 Gbps for most applications. That causes logjams, since the speed of talking to local disk drives is faster than communicating with those on the network.

Storage area networks deliver higher speeds than NAS, moving data at 2 to 4 Gbps, but they need expensive Fibre Channel switches and require a board for each computer that can cost as much as $1,000. The iSCSI protocol, which lets disks and computers in a SAN talk directly over Ethernet, is gaining popularity in shared storage networks as well.
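To put those link speeds in perspective, here is a rough sketch of how long a fixed volume of data would take to cross each kind of link at its nominal line rate. The 20-Gbyte payload is only an illustrative figure, the function name is hypothetical, and the calculation assumes decimal units and ignores protocol overhead, so real transfers would be slower.

```python
def transfer_seconds(gigabytes: float, link_gbps: float) -> float:
    """Seconds to move `gigabytes` (decimal) over a link running at `link_gbps`."""
    gigabits = gigabytes * 8  # 8 bits per byte
    return gigabits / link_gbps


# Nominal line rates quoted in the article; the payload size is an assumption.
links = [
    ("NAS over 1-Gbps Ethernet", 1.0),
    ("SAN at 2 Gbps", 2.0),
    ("SAN at 4 Gbps", 4.0),
]

for label, gbps in links:
    print(f"{label}: {transfer_seconds(20, gbps):.0f} seconds for 20 Gbytes")
```

Even in this best case, the gap between the slowest and fastest link is only a factor of four, which is part of why sites look beyond faster wires to the file system itself.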

The fast-disk communication speeds of cluster file systems could appeal to companies in industries that run highly I/O-dependent software, including banking, oil exploration, microchip manufacturing, automaking, aerospace, and Hollywood computer animation.

The small companies selling cluster file systems are starting to land some marquee customers. Cluster File Systems Inc., owner of the intellectual property associated with the Lustre software, counts Chevron Corp. among its customers. Startup Panasas Inc. sells a cluster file system called ActiveScale on its own hardware; its customers include Walt Disney Co. Larger vendors also are exploring the technology: IBM is considering adding object-based storage to its GPFS file system, according to Illuminata. Dell, meanwhile, has a deal with Ibrix to sell Ibrix's Fusion file system.

Demand for servers running Ibrix technology is causing Dell to re-examine how it bundles networking with computing. Dell historically has figured on about a gigabyte per second of bandwidth for every teraflop of computing power it sells a customer, says Victor Mashayekhi, a senior manager in Dell's scalable systems group. "Over time, we'll see that ratio increase," he says. "You'll see computing nodes become hungrier. Driving that is the amount of data that needs to be consumed."
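Mashayekhi's rule of thumb is simple enough to write down. The sketch below applies it to a hypothetical cluster; the function name, the 10-teraflop cluster size, and the higher 1.5 ratio are assumptions for illustration, not Dell figures.

```python
def io_bandwidth_gb_per_s(teraflops: float, gb_per_s_per_teraflop: float = 1.0) -> float:
    """I/O bandwidth budget implied by a bandwidth-per-teraflop ratio.

    The default of 1.0 reflects the roughly one gigabyte per second per
    teraflop that the article attributes to Dell's historical sizing.
    """
    return teraflops * gb_per_s_per_teraflop


# Hypothetical 10-teraflop cluster at the historical ratio...
print(io_bandwidth_gb_per_s(10))
# ...and at an assumed higher ratio, to show how the budget would grow.
print(io_bandwidth_gb_per_s(10, gb_per_s_per_teraflop=1.5))
```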

Large data requirements compelled the Texas Advanced Computing Center at the University of Texas in Austin to bring in Ibrix's Fusion. The center, which provides computing to the university's researchers, uses the file system to speed up a computational fluid-dynamics application, used by about 1,700 scientists, that simulates turbulence for aerodynamics. Each process in the program needs to write a file that's between 300 and 400 Mbytes in size. Using NFS, writing all the data required--about 20 Gbytes--took about 50 minutes, says Tommy Minyard, the center's high-performance computing group manager. That needed to occur every hour. As a result, for each hour the application ran, it computed for just 10 minutes. With Ibrix, the center's I/O write time for that program has shrunk to five minutes per hour. "It's benefiting all users of the system," Minyard says.
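The arithmetic behind Minyard's numbers can be sketched in a few lines. The function names are hypothetical, and the sketch assumes decimal units (1 Gbyte = 1,000 Mbytes) and that the same roughly 20 Gbytes is written each hour under both file systems, which the article implies but does not state outright.

```python
def compute_fraction(io_minutes_per_hour: float) -> float:
    """Share of each wall-clock hour left for computing after I/O."""
    return (60 - io_minutes_per_hour) / 60


def write_throughput_mb_per_s(gigabytes: float, minutes: float) -> float:
    """Average write rate in Mbytes per second, using decimal units."""
    return gigabytes * 1000 / (minutes * 60)


# Figures from the article: about 20 Gbytes written per hour,
# taking about 50 minutes over NFS and about 5 minutes with Ibrix.
for label, io_minutes in [("NFS", 50), ("Ibrix", 5)]:
    share = compute_fraction(io_minutes)
    rate = write_throughput_mb_per_s(20, io_minutes)
    print(f"{label}: {share:.0%} of each hour computing, about {rate:.1f} Mbytes/s written")

# Ratio of useful compute time per hour, Ibrix versus NFS.
print(f"Compute time per hour improves {compute_fraction(5) / compute_fraction(50):.1f}x")
```

Under these assumptions, the aggregate NFS write rate works out to under 7 Mbytes per second, and cutting the write time to five minutes raises the time spent computing from 10 minutes to 55 minutes of every hour.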

That's the good news. On the other hand, standards for cluster file systems are just emerging, and the software is owned by small companies without a proven track record. The systems also are limited in the types of computers on which they can run. Lustre is available only for Linux and Catamount, an obscure operating system from supercomputer maker Cray Inc. It's also hard to install and not well documented. That's causing some business-technology managers to wait and see.

"Lustre looks interesting, but we haven't deployed anything," says Andy Hendrickson, head of production technology at DreamWorks Animation SKG. "The big question is, does it do what it says and is it reliable enough to base production on? We have hard deadlines." And, perhaps summarizing the concerns for many companies considering cluster file systems, Hendrickson asks, "How much would we have to change our process?" Whether the benefits of this emerging technology outweigh the risks for businesses remains to be seen.
