Commentary

Drowning in Data

My partner Nancy had been having chronic, hard-to-diagnose problems with her computer and I was getting tired of looking over her shoulder or bumping her out of her chair to troubleshoot her machine.

At the same time, we were about a month away from buying me a new portable. When it arrived, Nancy would inherit my desktop computer, and hers would move across the driveway to the house for me to use when I needed to get away from the distractions of the office and get some work done. So I thought, why not accelerate the process and switch computers now? Then I could troubleshoot Nancy's computer on my own desk and she could have a new(er) computer right away.

 

I created a user account for myself on Nancy's machine and one for her on my machine, and schlepped files over to these new accounts via our file server. That way, if I missed moving something over, Nancy could always bump me off her old machine, log in as herself, and still have her familiar setup and all her files. Eventually I'd get rid of the redundant files and, finally, the old user accounts.

None of that was particularly smart, and it got worse when the portable arrived and I set it up.

I'm pretty fanatical about filing. I name files to indicate the project and date they are part of, I create folders/directories for all projects and parts thereof, and keep all my research materials there. But I was doing this computer swap in free moments while trying to meet deadlines and deal with the usual daily emergencies, so the upshot is that I now have nice clearly labeled folders of nice clearly labeled files on my portable, in my account on either of two desktop machines, and on the file server. Many are duplicates, and all I need to do is reconcile these. One of these days.

 

It's a mess.

 

So I guess the moral of this story is, never hire me to manage your files.

But another moral is, even when you're just managing megabytes of data, things can get out of control quickly. 

 

For dealing with petabytes, you need an entirely different approach.

Who's dealing with that much data? Soon, maybe all of us. When I checked the results of a recent Slashdot poll at 55,832 votes, the most popular answer to the question "How much storage will you be using ten years from now?" was "100TB - 1 PB," closely followed by "Depends... how much you got?"

 

At some point the file systems we've used for so long won't measure up to the demands of all this data. Apple is building read and write support for Sun's 128-bit ZFS file system into its Snow Leopard release of OS X Server. ZFS once stood for Zettabyte File System, but now it's an orphan acronym, like IBM. IDG (which I think is still a regular acronym) predicts that digital data will be generated at a rate of one zettabyte (1,000 exabytes, or 1,000,000 petabytes) per year by 2010.

 

In the shorter term, some people are confronting the challenges of massive storage right now. There will be a competition at the SC08 supercomputing conference (November 15-21 in Austin, TX; www.sc08.com) showcasing approaches to making the best use of storage in high-performance computing.

 

But in the meantime, I think I just have to roll up my sleeves and clean up my filing mess.
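
For anyone facing a similar reconciliation job, here is a minimal sketch of one way to start: walk several folder trees and group files whose contents are byte-for-byte identical. It is an illustration under stated assumptions, not a finished tool. The function names `file_digest` and `find_duplicates` are made up for this example, and it assumes every copy (portable, desktop accounts, file server share) is reachable as an ordinary directory path. It only reports duplicates; it never deletes anything.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Map content digest -> sorted paths, for content that appears more than once.

    `roots` is a list of directory paths (hypothetical here) to scan.
    """
    # Pass 1: bucket files by size, since files of different sizes
    # cannot have identical contents.
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with at least one other.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[file_digest(path)].append(path)

    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}
```

Calling `find_duplicates` with a list of folder paths returns a dictionary whose values are the groups of identical files. Comparing sizes before hashing is a design choice to avoid reading every file in full; the roots should not overlap, or the same file will be listed as its own duplicate.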

