Before He Disappeared: Conversation With Microsoft's Jim Gray

The database giant hasn't been seen since setting out from San Francisco in his sailboat Jan. 28. In his honor, we reprise a Q & A he had with Dr. Dobb's Journal in 2001.

DDJ: What is interesting is that the operating system that actually came out of Bell Labs in the early '70s and late '60s was absolutely not object-oriented in any way.

JG: Yes. In fact, Unix was a reaction to Multics, which Ken Thompson had worked on. It was an attempt to make an extremely [simple] system. It had to live in a very restricted environment. C++ was a derivative of BCPL. Simula 67 was also in the same algorithm genealogy. It was much more an experiment in what a good programming language ought to be as opposed to what a good systems programming language ought to be.

DDJ: I'm sorry. I sidetracked you.

JG: That's fine. At any rate, I had many interests and one of them grew out of the simulation work. In about 1971 or 1970, I went to work at IBM Research in New York and, subsequently, in San Jose. There was a lot of work, at that point, in databases going on. IBM was pushing, in fact, this idea of hierarchical data models. There was a small group of people (inspired largely by Ted Codd) who believed that the hierarchical data model was just too hard. I guess everybody has an epiphany story -- a story in which they finally got it. For me the epiphany experience was to actually write a program in IMS. I just couldn't believe how hard it was. In particular, we did various studies while we were looking at these issues. By and large, for every DL/I statement, every IMS statement, it took 17 debug runs in order to figure out whether it worked or not.

DDJ: Is that because the data was structured in graphs and they were possibly ... the way they put the locks?

JG: The actual problem, as far as I believe, is that the interface was so arcane and the error cases were such a mess. That is to say that if you just stepped back, as Charlie Bachman did and the DBTG guys did, and just thought about it abstractly, and you just drew things on the whiteboard, it was cool. It was fine. It was simple. It wasn't any problem. The actual implementations had this nightmare behavior that you could... There were so-called currency indicators associated with every set. You had a currency indicator. In IMS you had a position. If you moved the cursor, and everything went fine, great. If you moved the cursor, and anything went wrong, you had no idea what state you were in. You're operating in this graph world where, in fact, the rules about what happens when a record gets deleted, what happens when you get a "not found," what happens in all of these unusual cases, were either not specified or were bad behavior. It wasn't simple. At the top level, at the whiteboard level, I think IMS was okay. I think DBTG was okay. At the level at which you actually had to operate in order to write programs, it was a nightmare.

DDJ: In your career, have you encountered other examples where implementation of some theory appears very ideal on the whiteboard, but turns out to be completely impractical?

JG: I have certainly seen people fall into the performance trap. You come up with an idea and it looks like a good idea, but nobody can actually figure out how to make it work with reasonable performance.

DDJ: Is that something that separates computer science from mathematics, for example?

JG: Right. It is the case that many people think that the relational model's great strength is its mathematical simplicity. I think that simplicity allowed people to nail down the corner cases. To say, "This is what you do when this happens." We have added a few features to SQL like "null values," which complicate every decision that you make about the relational database. People should have really thought, when they added nulls, whether this was worth the complication because, to this day, it is possible for one SQL guru to hand another SQL guru a SQL statement and say, "Tell me what this means." It is a trick question. The person will go through the analysis and say, "Oh, it means this." They constantly forget that null values are in the picture. To restate it, I think the reason that relational systems caught on, and certainly the reason that I got enthusiastic about them, is that they had a simplicity which meant that they were actually going to be able to be used by people.
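[Editor's illustration, not part of the interview.] The kind of "trick question" Gray describes can be sketched with a few queries. This is a minimal example, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for any SQL system; the tables emp and closed, their columns, and the function name null_surprises are hypothetical and chosen only for illustration.

```python
import sqlite3

def null_surprises():
    """Run a few queries whose results surprise readers who forget NULLs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emp (name TEXT, dept TEXT);
        INSERT INTO emp VALUES ('ann', 'db'), ('bob', 'os'), ('cal', NULL);
        CREATE TABLE closed (dept TEXT);
        INSERT INTO closed VALUES ('os'), (NULL);
    """)

    def q(sql):
        return conn.execute(sql).fetchall()

    results = {
        # dept = 'db' and dept <> 'db' together do NOT cover every row:
        # for 'cal' both comparisons evaluate to NULL, so neither selects her.
        "eq": q("SELECT name FROM emp WHERE dept = 'db' ORDER BY name"),
        "ne": q("SELECT name FROM emp WHERE dept <> 'db' ORDER BY name"),
        # A single NULL in the subquery makes NOT IN unable to be true
        # for any row, so this query returns nothing at all.
        "not_in": q(
            "SELECT name FROM emp "
            "WHERE dept NOT IN (SELECT dept FROM closed) ORDER BY name"
        ),
        # COUNT(*) counts rows; COUNT(dept) skips rows where dept is NULL.
        "counts": q("SELECT COUNT(*), COUNT(dept) FROM emp"),
    }
    conn.close()
    return results

if __name__ == "__main__":
    for label, rows in null_surprises().items():
        print(label, rows)
```

A reader who "goes through the analysis" without NULLs in mind would expect the NOT IN query to return ann; under SQL's three-valued logic the comparison against the NULL in closed is unknown, so no row qualifies.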