Earlier this week I read a post here called "DBMS Past, Present and Future". I thought it would be appropriate to (re)introduce an alternate future (which is already happening) for RDBMS use. The post below is actually a repost of something I wrote last year on my old DDJ blog, i.e. pre-Dobbs Code Talk (with apologies to those of you who already read it back then).
The title I used then was - The RDBMS Is Dead
Okay, now that I have your attention -- the RDBMS isn't dead yet, but we can see a whole class of applications (maybe a couple of classes) where the importance of the RDBMS as we know it today is greatly diminished.
In an article I posted recently on InfoQ (which I also mentioned in the post on eBay architecture last week), I discussed the notion of database denormalization on Internet-scale sites (such as Amazon, eBay, Flickr, etc.). One argument for denormalization is immutable data, where there isn't much to gain from normalization in the first place.
The other consideration is entity representation vs. speed. The problem is that joins are slow, and sometimes you reach corners where, if you want any kind of decent speed, you need to denormalize. Todd Hoff notes that as well:
The problem is joins are relatively slow, especially over very large data sets, and if they are slow your website is slow. It takes a long time to get all those separate bits of information off disk and put them all together again. Flickr decided to denormalize because it took 13 Selects to each Insert, Delete or Update.
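To make the trade-off concrete, here is a minimal sketch, loosely inspired by the Flickr example above; the schema and names are made up for illustration. In the normalized form, rendering a photo page costs a join; in the denormalized form, the owner's name is copied into the photo row so the hot read path is a single-table lookup -- at the cost of having to touch every photo row if a user ever renames themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized: reading a photo page needs a join per related entity.
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)")
cur.execute("INSERT INTO users VALUES (1, 'alice')")
cur.execute("INSERT INTO photos VALUES (10, 1, 'sunset')")

normalized = cur.execute(
    "SELECT p.title, u.name FROM photos p JOIN users u ON p.user_id = u.id "
    "WHERE p.id = 10"
).fetchone()

# Denormalized: the owner's name lives in the photo row itself,
# so the read is a single indexed lookup with no join.
cur.execute("CREATE TABLE photos_denorm (id INTEGER PRIMARY KEY, user_name TEXT, title TEXT)")
cur.execute("INSERT INTO photos_denorm VALUES (10, 'alice', 'sunset')")
denormalized = cur.execute(
    "SELECT title, user_name FROM photos_denorm WHERE id = 10"
).fetchone()

assert normalized == denormalized == ("sunset", "alice")
```

Writes now pay the price reads used to pay -- which is exactly the bargain the big sites are making, since their read-to-write ratios are so lopsided.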
The point, however, is that these "corner cases" are becoming more and more prevalent even in smaller-scale applications -- especially when you have complex entities (as is the case with defense systems, for example). Mats Helander recently wrote a post about saving entities to a blob, adding fields only as needed for indexing and identity purposes. Mats also suggests the semi-transparent option of using XML columns, where the database can do something with the otherwise opaque data.
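The blob-plus-index-columns idea can be sketched in a few lines; this is my own illustration, not Mats' code, and the entity shape and column names are invented. The whole entity is serialized into an opaque blob, while only the fields needed for identity and lookup are promoted to real, indexable columns.

```python
import pickle
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Only 'id' and 'status' are real columns; everything else rides in the blob.
cur.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, status TEXT, payload BLOB)")
cur.execute("CREATE INDEX idx_status ON entities (status)")

# A hypothetical complex entity (pickle stands in for whatever
# serialization -- binary, XML, etc. -- you'd actually use).
entity = {"id": "track-42", "status": "active", "velocity": [3.0, 1.5]}
cur.execute(
    "INSERT INTO entities VALUES (?, ?, ?)",
    (entity["id"], entity["status"], pickle.dumps(entity)),
)

# The database can filter on the promoted column; the rest stays opaque.
row = cur.execute("SELECT payload FROM entities WHERE status = 'active'").fetchone()
assert pickle.loads(row[0]) == entity
```

XML columns are the middle ground: the payload is still one column, but the database can peer inside it for querying and indexing.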
This, in fact, demonstrates that the relational data future is indeed not totally secure, as we see that leading databases are beginning to treat XML data (which is hierarchical, not relational) as a native citizen -- to the point where we can even index XML data.
So far we've seen trends toward more denormalization and toward handling non-relational data -- what else? Ah, transactions.
I've worked on several systems where the data was constantly updated and essentially represented the system's view of the world outside; the focus was on availability and latency. This is again aligned with the approach taken by the large Internet sites, which emphasize eventual consistency over immediate consistency.
In distributed systems, crashes happen. The RDBMS is a show-stopper when it comes to crashes -- if we can't commit, we need to stop and roll back, and maybe then we can start over. Is this acceptable? There are many scenarios where it is not. I've seen it in defense systems, in communications systems, and even in e-commerce systems ("if you are not responsive, I'll just go to the competition").
What do you do in the presence of error? Joe Armstrong suggests the following as the basis for Erlang in his thesis:
To make a fault-tolerant software system which behaves reasonably in the presence of software errors we proceed as follows:
1. We organize the software into a hierarchy of tasks that the system has to perform. Each task corresponds to the achievement of a number of goals. The software for a given task has to try and achieve the goals associated with the task. Tasks are ordered by complexity. The top level task is the most complex, when all the goals in the top level task can be achieved then the system should function perfectly. Lower level tasks should still allow the system to function in an acceptable manner, though it may offer a reduced level of service. The goals of a lower level task should be easier to achieve than the goals of a higher level task.
2. We try to perform the top level task.
3. If an error is detected when trying to achieve a goal, we make an attempt to correct the error. If we cannot correct the error we immediately abort the current task and start performing a simpler task.
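The strategy quoted above fits in a few lines of code. This is a minimal sketch of the idea, not Erlang itself, and the task names and failure mode are made up: tasks are tried from most complex to simplest, and when a goal cannot be achieved we abort that task and fall back to a simpler one instead of halting.

```python
# Hypothetical service levels, ordered from most complex to simplest.
def full_service():
    raise RuntimeError("backend database unreachable")  # simulated fault

def cached_reads_only():
    return "serving stale reads from cache"

def static_page():
    return "maintenance page"

def run(tasks):
    """Try each task in order; on failure, degrade to the next simpler one."""
    for task in tasks:
        try:
            return task()
        except Exception:
            continue  # couldn't correct the error: abort, try a simpler task
    raise SystemExit("no task level achievable")

result = run([full_service, cached_reads_only, static_page])
assert result == "serving stale reads from cache"
```

The system stays up with reduced service instead of stopping to roll back -- exactly the opposite of the abort-and-restart posture a transactional store pushes you toward.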
On top of that, we try to keep any update local, i.e. within a task boundary on the hardware where the task occurred -- distributing transactions is not a good option. I outlined why when I talked about SOA and cross-service transactions, and the reasoning holds here as well.
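Here is a hedged sketch of what "keep the update local" looks like in practice; the store and queue names are illustrative, and real systems would use a durable log rather than in-memory structures. A write commits to the local store immediately and is replicated asynchronously through an outbox, instead of being held hostage by a distributed transaction spanning every node.

```python
import queue

local_store = {}     # the node that handled the task
replica_store = {}   # some other node, updated later
outbox = queue.Queue()

def write(key, value):
    local_store[key] = value   # local commit: fast, never blocks on peers
    outbox.put((key, value))   # replication happens outside the write path

def replicate_once():
    key, value = outbox.get()
    replica_store[key] = value  # eventually consistent, not immediately

write("order-7", "confirmed")
# The write succeeded locally even though no replica has seen it yet.
assert "order-7" in local_store and "order-7" not in replica_store
replicate_once()
assert replica_store["order-7"] == "confirmed"
```

The window where the stores disagree is the price of availability -- the BASE trade mentioned below, as opposed to a two-phase commit that would block the write until every participant votes.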
Well, truth be told, the RDBMS is not dead, and its demise is probably not even around the corner. Nor does this mean there aren't any uses for a database. But the same is true for other architectural choices -- who ever said a single-tier solution isn't the right one for very specific types of systems?
The RDBMS succeeded in becoming the de facto standard for building systems because it offers some very compelling attributes -- ACID brings a lot of peace of mind. Large-scale systems, low-latency systems, and fault-tolerant systems opt for another set of compelling attributes (BASE). The point is that when you design your next solution, conventional database thinking is something you should at least give another thought to, instead of just following dogma.