The Fall of the Relational Empire

Relational is given a lot of credit because of its staying power and incumbency, which is often confused with universal usefulness. But if you step back and think about it, there is nothing special about a relational database, and in the world as we see it evolving, the physical structure, and even location of data, no longer matters...

April 28, 2008

A lot has been written about the suitability of relational databases in the ever-expanding Web world of text, pictures and video, including in Rajan Chandras' latest blog. Relational is given a lot of credit because of its staying power and incumbency, which is often confused with universal usefulness. But if you step back and think about it, there is nothing special about a relational database, and in the world as we see it evolving, the physical structure, and even location of data, no longer matters. What made relational special was not the database; it was SQL itself.

But now SQL is the problem. While it is infinitely useful for dealing with structured data, it has no facility at all for the other 95% of data. In fact, it is pretty poor at analytics too, and we've been jumping through hoops trying to get it to do analytical work for a long time. Only a couple of vendors have figured it out, and they've invested thousands of person-years in their engines. By contrast, 25 years ago I had APL, a more expressive and compact language to work with. Here is an example from Wikipedia (try this in SQL): Give me all the prime numbers from 1 to R. In APL it is: (~R∊R∘.×R)/R←1↓⍳R

That's it; that's the whole program. For doubters, there is no prime number operator, just drop, assignment, cross product multiplication (Boolean) and membership. It takes a while to understand APL, but in essence, you blow everything up, then shake and sift the result until the answer falls out, and you do it all in extremely cryptic and compact notation.
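For readers who do not read APL, the same blow-up-and-sift idea can be sketched in Python. This is only an illustration of the approach described above, not a translation anyone would use in practice; the function name `primes_up_to` and its parameter `r_max` are made up for this example.

```python
def primes_up_to(r_max: int) -> list[int]:
    """List the primes from 1 to r_max using the blow-up-and-sift approach."""
    # Drop + assignment: take the integers 1..r_max, drop the first one (1),
    # and keep the rest as the list of candidates.
    candidates = list(range(2, r_max + 1))

    # Blow everything up: build the full multiplication table of the
    # candidates against themselves. Every composite number in range
    # appears somewhere in this table; no prime does.
    products = {a * b for a in candidates for b in candidates}

    # Membership + sift: keep only the candidates that never show up
    # as a product of two candidates.
    return [n for n in candidates if n not in products]


print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Like the APL one-liner, this deliberately does far more work than necessary (the table has roughly R squared entries), which is exactly the point: you state the whole space and filter it, rather than writing a loop.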

There were other tools too. I'd challenge any SQL engine for analytics to do half as much analytical processing as, say, Express circa 1995. I'm not suggesting that we make BI pervasive by teaching everyone APL and Express. If you think the top of the BI pyramid is small now, you would need a scanning electron microscope to find it with those languages. The point is that SQL is miraculous only within its own domain of structured data. For the rest of the universe, we need something else.

In the end, how data is stored and managed should be unbundled from how it is addressed by other applications and processes. That was the beauty of relational databases because of SQL. But SQL is conceptually based on tuples and keys and atomicity, which do not translate to the world we live in today. In 1985, I was consulting to a large New York corporation, working for the Treasurer and building predictive and stochastic models for, among other things, capital expenditures based on risk analysis. Even though I was using weird languages like APL, I was co-located on the floor with some of the mainstream IT management group, and I will never forget how one of them, in the most condescending manner possible, said, "I'll bet you don't even know what Third Normal Form is." He was right. That was a warning of what was coming.