Commentary
3/24/2010
01:45 PM
Seth Grimes

Is It Time For NoETL?

I've been bemused by NoSQL, the movement that propounds database-management diversity. Is it now similarly time for a NoETL movement, reflecting a new world of liberated, semantically enriched, analysis-ready, mashable data?

I've been bemused by NoSQL, the movement that propounds database-management diversity with the very valid claim that a one-size-fits-all relational approach is a poor match for emerging, demanding data challenges. Didn't we all know that relational databases, based on tables and joins, aren't always best? Hadn't the issue been the lack of usable, reliable, enterprise-worthy alternatives? Similarly, haven't we long understood that wiring up extract-transform-load (ETL) is laborious -- all those adapters and rules and the need for hand-matching -- even if necessary given the perceived need to gather, cleanse, and integrate diverse BI data sources? Is that preparatory data work still essential? Or is it now time for a NoETL movement, reflecting a new world of liberated, semantically enriched, analysis-ready, mashable data?

NoSQL

The "SQL" of NoSQL is Structured Query Language, which has been closely associated with relational databases since the '70s, the early days of the RDBMS. With Oracle's and IBM's support, SQL vanquished superior alternatives such as Ingres's QUEL. SQL is an easy target for criticism, on its own and standing in as a proxy for relational systems.

SQL is stateless, and because of that limitation vendors have wrapped it in diverse, incompatible procedural languages to support multi-step data processes. SQL's set-oriented approach creates a data-handling burden for application programmers, so we have cursors, a row- or record-oriented retrieval kludge. Correlated subqueries are a usability nightmare, and the check-list demand for ACID compliance -- transactional atomicity, consistency, isolation, durability -- is simply overhead overkill for analytical applications.
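To make the set-versus-row contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders table and its values are hypothetical, invented only for illustration. The first query lets the engine aggregate the whole set in one statement; the second walks a cursor row by row and does the bookkeeping in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 100.0), ("west", 250.0), ("east", 50.0)],
)

# Set-oriented: one declarative statement; the engine does the grouping.
set_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region")
)

# Row-oriented: the application iterates a cursor and accumulates by hand.
row_totals = {}
cursor = conn.execute("SELECT region, amount FROM orders")
for region, amount in cursor:
    row_totals[region] = row_totals.get(region, 0.0) + amount
```

Both approaches arrive at the same totals; the difference is where the work, and the burden, sits.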

NoSQL is a catch-all term for a grab bag of relational alternatives. NoSQL is a New Testament that seeks to supplant the Codd of Old.

SQL's deficiencies have been known for years; nonetheless, SQL has served the database community well and supported the creation of immense business value for the many, many millions of RDBMS end users. So have the ETL technologies that feed relational (and other) databases from flat-file, spreadsheet, operational-system, and database sources -- technologies, plural. Is ETL still relevant in a world of semantic computing?

Semantic computing

Semantic computing relies on meaning-ful data. That data may be stored in RDBMS tables with an associated metadata repository. It may be modeled with a graph structure, described via RDF (the Resource Description Framework, often serialized as XML), and captured in a "triple store" for query via SPARQL. It may be mapped into an ontology, a mechanism for knowledge representation. ("Knowledge" here is a network of relationships, a.k.a. facts, that link entities within a subject-matter domain.)
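The triple idea is easier to see in miniature. The sketch below is a toy, in-memory stand-in written in plain Python, not RDF or SPARQL proper; the entities, predicates, and the match helper are all hypothetical. Each fact is a (subject, predicate, object) tuple, and a query is a pattern in which None acts as a wildcard.

```python
# A toy "triple store": each fact is (subject, predicate, object).
# All names here are made up for illustration.
TRIPLES = {
    ("acme", "is_a", "customer"),
    ("acme", "located_in", "boston"),
    ("globex", "is_a", "customer"),
    ("globex", "located_in", "chicago"),
    ("boston", "part_of", "massachusetts"),
}

def match(pattern, triples=TRIPLES):
    """Return the sorted triples matching a pattern; None is a wildcard."""
    return sorted(
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    )

# Single pattern: every entity asserted to be a customer.
customers = [s for s, _, _ in match((None, "is_a", "customer"))]

# Two-hop question: which customers sit in a city that is part of Massachusetts?
ma_cities = {s for s, _, _ in match((None, "part_of", "massachusetts"))}
ma_customers = [
    s for s, _, o in match((None, "located_in", None))
    if s in customers and o in ma_cities
]
```

A real triple store adds namespaces, typed literals, and a query planner, but the underlying model is this same network of linked facts.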

Semantic computing involves methods and software designed to mine meaning, relationships, and usages from sources both conventional and unconventional, from structured databases and from the chaos that is the Web. All that good stuff is inferred from whatever definitions, data profiles (i.e., information on the distributions of the values of variables), and context are available.
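As a rough illustration of what a data profile can contribute, the sketch below summarizes the distribution of values in a column and then scores how much two columns from different sources overlap, a crude hint that they may describe the same thing. The column names, sample values, and both helper functions are assumptions made for this example; real profiling tools weigh far more evidence.

```python
from collections import Counter

def profile(values):
    """Summarize a column: row count, distinct count, and value frequencies."""
    counts = Counter(values)
    return {"rows": len(values), "distinct": len(counts), "counts": dict(counts)}

def overlap(a, b):
    """Jaccard overlap of the distinct values in two columns (0.0 to 1.0)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

# Hypothetical columns drawn from two different systems.
crm_state = ["MA", "MA", "IL", "NY"]
erp_region = ["IL", "MA", "NY", "NY", "CA"]
erp_status = ["open", "closed", "open", "open", "open"]

state_profile = profile(crm_state)
region_score = overlap(crm_state, erp_region)   # high: likely the same concept
status_score = overlap(crm_state, erp_status)   # zero: unrelated columns
```

Here the state and region columns share most of their distinct values, so a tool could propose them as a join candidate without anyone hand-matching the fields.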

The payoff is that you have all the ingredients necessary to support dynamic integration, to enable as-you-like-it data mashability.

Dynamic integration: NoETL

A number of tools claim or aim to support dynamic integration. Some are metadata- or semantics-driven, and so are essentially visually programmed, without reliance, for the end user or behind the scenes, on the ETL equivalent of SQL. Vendors offering them include Expressor, Progress Software, and JackBe, the last an enterprise mashup vendor.

I'll credit JackBe with prompting me to think much more intently about this stuff than I would have otherwise. I wrote a short paper for them, Nimble Intelligence: Enterprise BI Mashup Best Practices, and presented on the same topic in a JackBe webinar yesterday. (I was paid for this work and for strategy consulting.) The thought is that mashups bring agility to BI, the possibility of integrating the data and application elements you need, when needed, without much or most of the overhead typically associated with conventional BI.

It's freedom, baby, yeah!

NoETL is an extension of this concept, actually a sort-of retake on Enterprise Information Integration (EII), a once-promising but now neglected notion that one can successfully build and query a unified virtual schema, spanning data sources, without requiring data collection into a single data warehouse or repository. In considering NoETL, let's recognize the value of traditional ETL and of EII and use them where they fit best. Let's also understand the promise and power of semantics, and of the diversity of NoSQL-ite data representations, in seeking data integration approaches that enable truly agile BI.
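For readers who want to see the virtual-schema idea in miniature, here is a sketch using Python's built-in sqlite3 module. It is an analogy under stated assumptions, not an EII product: two attached in-memory SQLite databases stand in for separate source systems, and the crm and billing names, tables, and rows are hypothetical. A temporary view joins across both sources at query time, and nothing is copied into a central store.

```python
import sqlite3

# Autocommit mode, so ATTACH never runs inside an open transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Two separate databases stand in for two independent source systems.
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

# Hypothetical source tables and rows, for illustration only.
conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)
conn.executemany(
    "INSERT INTO billing.invoices VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 75.0)],
)

# The "virtual schema": a TEMP view defined over both sources.
# SQLite allows temporary views to reference tables in attached databases.
conn.execute("""
    CREATE TEMP VIEW customer_revenue AS
    SELECT c.name AS name, SUM(i.amount) AS revenue
    FROM crm.customers AS c
    JOIN billing.invoices AS i ON i.customer_id = c.id
    GROUP BY c.name
""")

# Consumers query the view; the join is resolved against the sources on demand.
revenue = dict(
    conn.execute("SELECT name, revenue FROM customer_revenue ORDER BY name")
)
```

Real sources differ in format, semantics, and latency, which is exactly where the hard part of EII lies; the sketch shows only the shape of the approach.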
