My Last Interview With Jim Gray
A few weeks before Microsoft researcher Jim Gray set sail on his ill-fated voyage from San Francisco Bay, we had an e-mail exchange. As a journalist who's covered Microsoft since the mid '90s, I got to know Microsoft's database genius as someone who was invariably helpful, and I contacted Jim for a story I was researching on Hewlett-Packard's move into the data warehousing market. As always, he knew the answers to my questions and offered more than I had asked him for. He also made an observation that, in light of his disappearance, now seems eerily prophetic.

I first met Jim about 12 years ago at a data warehousing conference in Boston. He had given a presentation in a half-empty ballroom on his idea that a supercomputer-like system could be created by harnessing hundreds or thousands of industry-standard PCs. He called the idea "cyberbricks," and he had joined Microsoft in 1995 to bring that vision to reality. At the time, Microsoft's SQL Server database on Windows NT didn't scale to the needs of large companies, and Jim was brought in to change that.

Not long after that conference, at a meeting of SQL Server users in Seattle, Jim tipped me off to his plan to load a terabyte of data onto SQL Server to show that it could be done. Back then, a terabyte was a huge amount of information and not readily available, so he used satellite images to meet his goal. That project came to be known as TerraServer, the predecessor to Microsoft's Virtual Earth. In May 1997, Jim joined Bill Gates on stage in New York at an event called Scalability Day to demonstrate the megadatabase that he and his colleagues had created.

Before coming to Microsoft, Jim worked at Digital Equipment Corp., Tandem Computers, and IBM Research. Knowing Jim had spent 10 years at Tandem, I thought of him immediately when Hewlett-Packard agreed to talk to InformationWeek about its plans to enter the data warehousing market using a refurbished Tandem NonStop database.
The original database was partly Jim's work.

I e-mailed Jim on Thursday, Dec. 21, just two days before the extended holiday weekend. Within a few hours, he responded. "Great to hear from you," he began. "I am GLAD to help you any way I can…" It's the kind of reply I had grown to expect from Jim -- fast, friendly, detailed, and without the red tape.

My questions had to do with HP's plan to overhaul NonStop from a database originally developed for transaction processing into one for data warehousing: Could NonStop be tuned to work well for data analysis workloads? Was this a good idea on HP's part? How did SQL Server/Windows compare with NonStop for data warehousing?

"Yes, NonStop SQL was originally developed to run lots of transactions per second (1,000 eventually, but about 250 in its first incarnation). That number is laughable now, but at the time it was a multimillion dollar computer," Jim responded. "The folks at Tandem/HP have been cooking up a new system for the last decade."

Jim went on to describe some of what's different in the new NonStop database with its emphasis on SQL execution. "Not the low-level record manager called DB2 (disk process 2) that does single variable queries," he explained. "It is things like parallel sort, parallel join." Jim could always be counted on for a deep dive into the technical minutiae.

Despite his obvious knowledge, Jim didn't pretend to know more than he did, pointing out that some of what he was relating was "rumor and hearsay." To facilitate fact checking, he copied Goetz Graefe, a former Microsoft colleague, on his e-mail to me. Goetz had recently gone to work for HP Labs and was a database expert, too.

A few weeks later, when the gravity of Jim's disappearance became clear, I e-mailed Goetz. "He has been a great positive force in the industry, in research, and also in my personal career," Goetz wrote, recalling that once, as a young college faculty member, he had taken someone else's advice over Jim's.
"Much to my regret," Goetz added.

Jim Gray, through his openness and reliability, became one of my most respected and valued sources. He lived in a world of exponential numbers but had a knack for simplifying the outrageously complex world of algorithms and databases with millions of rows and tables.

So I asked Jim a big-picture question: What's the next big problem to solve in data warehousing? "The next big problem is making it easy," he wrote back. "Right now it takes far too much expertise to install and use the systems for data mining. The cost of hardware and software is near zero, and there are just not enough gurus to do all the things that can be done."

That last observation rings in my ears. "There are just not enough gurus to do all the things that can be done." They are the final words in Jim's e-mail to me, and I used them near the end of the HP story. As far as I know, it's the last comment Jim ever made to the media.

The computer industry has lost one of its greatest gurus. And we are heartbroken over it.