Celko on SQL: Auxiliary Tables vs. Declarative Coding
Auxiliary tables are not only portable across SQL environments and adaptable to external software, they're often a faster, better choice than SQL with computations. And as chips and hardware get faster and cheaper, the auxiliary table approach will make even more sense.
For years I have been pushing the use of various kinds of auxiliary tables for solving SQL queries. An auxiliary table is one that holds static (or relatively static) data, such as an enterprise calendar, a sequence or a function that is tricky to compute.
One reason I like using auxiliary tables is that they'll make even more sense over the next five years as the computing world changes in several important ways:
1. 64-bit hardware will become the standard off-the-shelf chip architecture. The hardware keeps getting faster with every release; Moore's Law has held up pretty well over the years in that regard.
2. We'll have cheaper and faster secondary storage. The cost per megabyte is going down faster than Moore's Law would have predicted, and solid-state memory is getting cheap enough that when I speak at a conference these days, I get a memory stick instead of a coffee cup as a thank you knick knack. Perhaps the whole idea of primary and secondary storage will disappear and be replaced with just online storage -- anything else being archival.
3. Multi-core processors will become the standard off-the-shelf chip architecture -- and I mean tens of processors, not just the dual- and quad-core chips we have today. Years ago, Jerry Pournelle predicted processors would become so cheap that you would allocate one processor per task in your systems. This trend means parallelism is going to be done at the hardware level. The set-oriented model of SQL makes it ideal for parallelism, and massive parallelism in the hardware means that JOINs can be faster than a stream of computations.
So what do these trends have to do with auxiliary tables? The key point is that auxiliary tables are portable across all SQL products (you only need a basic JOIN operation), and they also can be used by other software. For example, you can load a look-up table into a spreadsheet, a COBOL program or pretty much any other language or tool. You need only rewrite proprietary-syntax procedures, triggers and functions into the target SQL dialect or target language.
There are situations today where the auxiliary table approach is faster than SQL with computations, as this month's puzzle blog clearly demonstrates. There are situations where the computational approach is faster. There are situations where the choice of auxiliary table-versus-computations all depends on the size of the problem, statistical data distributions, choice of SQL products and a ton of other things. Life is often like that.
Here are Celko's heuristics on when to use and when not to use auxiliary tables:
1. Don't use auxiliary tables when the domain is too big (whatever "too big" means this week with new technology)
2. Don't use auxiliary tables when the domain is dynamic
About the Author
You May Also Like