Are Careless Linux Users Giving Ext4 A Bad Rap?

There is a new Linux file system in town, and some people see it as a menace. But the real problem may have more to do with users who ignore simple, common-sense IT practices.
In the early 1990s, Linux got its very own file system. Since then, the extended file system (ext) has evolved constantly; perhaps the single biggest innovation came in the late 1990s, when ext3 added journaling capabilities to its predecessor, ext2.

Today, while ext2 remains in use, ext3 is the gold standard for Linux file systems. In fact, developers took nearly a decade to get a successor to ext3 ready for mainstream use, finally removing the "experimental" designation from ext4 late last year.

Since then, ext4 has appeared in several major Linux distro updates. Ubuntu 9.04, for example, offers it as an install-time option but still uses ext3 by default; Fedora 11, which is due out next month, will reportedly use ext4 as its default file system option. Other major distros, including OpenSUSE, will also offer at least optional ext4 support.

Without a doubt, ext4 adds quite a few interesting features. I won't go into all of them here, although if you're interested in the gory details, Linux Magazine has an excellent overview of the new file system. In general, however, ext4 focuses on improving Linux file system reliability and performance -- especially for massive storage devices using large, potentially very complex, directory structures.

Also, since ext4 uses checksumming to protect the integrity of its journal data, it should provide an additional safeguard against lost or corrupt data.
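To make the idea concrete, here is a toy sketch of how a checksum lets software notice a damaged record before trusting it. It is an illustration only: the record layout and the names `seal` and `is_intact` are invented for this example, and nothing here reflects ext4's actual on-disk journal format.

```python
import zlib


def seal(payload: bytes) -> bytes:
    """Append a 4-byte CRC32 to a record (toy layout, not ext4's format)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def is_intact(record: bytes) -> bool:
    """Recompute the CRC32 over the payload and compare it with the stored one."""
    if len(record) < 4:
        return False
    payload, stored = record[:-4], record[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


record = seal(b"update inode 42")
# Flip a single bit in the first byte, as a partial or corrupted write might.
damaged = bytes([record[0] ^ 0x01]) + record[1:]
```

Here `is_intact(record)` is true and `is_intact(damaged)` is false, so a replay routine built along these lines could skip the damaged record instead of applying bad data.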

Yet data corruption issues are exactly where, according to some users, ext4 fails to deliver. After reports of massive data loss on pre-release versions of Ubuntu Linux 9.04 started to surface, follow-up articles in some major online IT publications traced the problem to a feature in ext4 called delayed allocation.

Delayed allocation can improve disk performance by organizing and writing data far more efficiently. In ext4, however, it can leave a window during which data that an application has written, but that has not yet reached the disk, is extremely vulnerable to a system crash or power loss. Even the file system's journaling feature cannot recover data lost during this window of vulnerability, which can last up to 60 seconds in some situations.

It's no mystery why Linux users working with ext4 are unhappy when they encounter this issue. Before you label ext4 a problem child, however, let's consider a few important points.

First, keep in mind that delayed allocation isn't a bug -- it's a feature. Other modern file systems, including ZFS (used in Solaris) and HFS+ (Mac OS X), also employ delayed allocation techniques.

Next, consider the fact that some Linux users -- to put it bluntly -- were begging for trouble long before they tried ext4. In many cases, crashes that cause data loss are due to power failures on systems running without UPS protection. In many others, users failed to protect critical data with regular backups.

Both of these problems are unforgivable, and they have nothing to do with ext4 or any other Linux feature. UPS hardware is cheap, easy to use, and readily available. Besides allowing users to shut down a system cleanly during a power failure, a UPS protects systems against crashes due to voltage spikes or brownouts. Spending thousands on a new system and then failing to protect this investment with a $100 UPS is just plain dumb.

Common-sense precautions won't make using ext4 a risk-free experience. Many Linux applications, for example, do not yet incorporate changes that could reduce the chances of data loss on a system running ext4. Developers are also still working on a configuration option for ext4 that would further minimize data-loss risks.
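The article does not spell out which application changes it has in mind. One widely recommended pattern, though, is to flush a file's contents to disk explicitly before renaming it over the old copy, rather than trusting the file system to do so in time. The sketch below shows that pattern in Python; the function name `replace_file_durably` is hypothetical, and the directory sync at the end assumes a Linux or other POSIX system.

```python
import contextlib
import os
import tempfile


def replace_file_durably(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a crash leaves the old or the new file intact."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the rename stays
    # on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to commit the data to disk
        os.replace(tmp_path, path)  # atomically swap in the new file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself is recorded (POSIX assumption).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A call such as `replace_file_durably("settings.conf", b"theme=dark\n")` costs more than a plain write because it waits for the disk. That is the trade-off: the application gives up some of the speed that delayed allocation provides in exchange for closing the vulnerable window for that one file.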

So, ext4 is still very much a work in progress, and for many businesses, the costs of adopting it will outweigh the benefits. Still, it's impossible to manage risk without understanding what causes it in the first place. And clearly, some Linux users are using ext4 as a scapegoat when their own sloppy IT practices finally turn around and bite them.
