The InformationWeek -- Blogs


My Last Interview With Jim Gray


Posted by John Foley, Mar 29, 2007 04:14 PM

A few weeks before Microsoft researcher Jim Gray set sail on his ill-fated voyage from San Francisco Bay, we had an e-mail exchange. As a journalist who's covered Microsoft since the mid-'90s, I got to know Microsoft's database genius as someone who was invariably helpful, and I contacted Jim for a story I was researching on Hewlett-Packard's move into the data warehousing market. As always, he knew the answers to my questions and offered more than I had asked him for. He also made an observation that, in light of his disappearance, now seems eerily prophetic.

I first met Jim about 12 years ago at a data warehousing conference in Boston. He had given a presentation in a half-empty ballroom on his idea that a supercomputer-like system could be created by harnessing hundreds or thousands of industry-standard PCs. He called the idea "cyberbricks," and he had joined Microsoft in 1995 to bring that vision to reality. At the time, Microsoft's SQL Server database on Windows NT didn't scale to the needs of large companies, and Jim was brought in to change that.

Not long after that conference, at a meeting of SQL Server users in Seattle, Jim tipped me off to his plan to load a terabyte of data onto SQL Server to show that it could be done. Back then, a terabyte was a huge amount of information and not readily available, so he used satellite images to meet his goal. That project came to be known as TerraServer, the predecessor to Microsoft's Virtual Earth. In May 1997, Jim joined Bill Gates on stage in New York at an event called Scalability Day to demonstrate the megadatabase that he and his colleagues had created.

Before coming to Microsoft, Jim worked at Digital Equipment Corp., Tandem Computers, and IBM Research. Knowing Jim had spent 10 years at Tandem, I thought of him immediately when Hewlett-Packard agreed to talk to InformationWeek about its plans to enter the data warehousing market using a refurbished Tandem NonStop database. The original database was partly Jim's work.

I e-mailed Jim on Thursday, Dec. 21, just two days before the extended holiday weekend. Within a few hours, he responded. "Great to hear from you," he began. "I am GLAD to help you any way I can…" It's the kind of reply I had grown to expect from Jim -- fast, friendly, detailed, and without the red tape.

My questions had to do with HP's plan to overhaul NonStop from a database originally developed for transaction processing into one for data warehousing: Could NonStop be tuned to work well for data analysis workloads? Was this a good idea on HP's part? How did SQL Server/Windows compare with NonStop for data warehousing?

"Yes, NonStop SQL was originally developed to run lots of transactions per second (1,000 eventually, but about 250 in its first incarnation). That number is laughable now, but at the time it was a multimillion-dollar computer," Jim responded. "The folks at Tandem/HP have been cooking up a new system for the last decade."

Jim went on to describe some of what's different in the new NonStop database with its emphasis on SQL execution. "Not the low-level record manager called DB2 (disk process 2) that does single variable queries," he explained. "It is things like parallel sort, parallel join." Jim could always be counted on for a deep dive into the technical minutiae.

Despite his obvious knowledge, Jim didn't pretend to know more than he did, pointing out that some of what he was relating was "rumor and hearsay." To facilitate fact checking, he copied Goetz Graefe, a former Microsoft colleague, on his e-mail to me. Goetz had recently gone to work for HP Labs and was a database expert, too.

A few weeks later, when the gravity of Jim's disappearance became clear, I e-mailed Goetz. "He has been a great positive force in the industry, in research, and also in my personal career," Goetz wrote, recalling that once, as a young college faculty member, he had taken someone else's advice over Jim's. "Much to my regret," Goetz added.

Jim Gray, through his openness and reliability, became one of my most respected and valued sources. He lived in a world of exponential numbers but had a knack for simplifying the outrageously complex world of algorithms and databases with millions of rows and tables. So I asked Jim a big-picture question: What's the next big problem to solve in data warehousing?

"The next big problem is making it easy," he wrote back. "Right now it takes far too much expertise to install and use the systems for data mining. The cost of hardware and software is near zero, and there are just not enough gurus to do all the things that can be done."

That last observation rings in my ears. "There are just not enough gurus to do all the things that can be done." They are the final words in Jim's e-mail to me, and I used them near the end of the HP story. As far as I know, it's the last comment Jim ever made to the media. The computer industry has lost one of its greatest gurus. And we are heartbroken over it.
