InformationWeek: The Business Value of Technology

News In Review

December 8, 1997

Data At Your Fingertips

Microsoft aims spec at making data universally available

By Don Kiely

Data access should be a mature technology by now, offering easy access to any kind of information. This is generally true for relational databases, but much corporate data is stored in nonrelational formats. It's virtually impossible for a single application to have access to all corporate data. The problem used to be that companies wanted to incorporate legacy data with relational and other miscellaneous data. But now, we need to have access to information of any kind located anywhere.

Since releasing SQL Server and Microsoft Access in the last few years, Microsoft has introduced several data stores that are decidedly nonrelational, such as the hierarchical models in Outlook and OLE structured storage. Sometimes it creates an Open Database Connectivity (ODBC) driver for such data, such as its text and Excel drivers, but these are kludges at best. Since the underlying data is not relational, the driver has to perform machinations to make it appear relational to an application using the driver.

There are essentially two broad solutions to the problem. The first is to physically move all data to a single database structure that all applications access. This reaction focuses on the database rather than data access, since all data is accessed the same way. This approach is generally favored by IBM and Oracle, big database vendors with proprietary formats.

But this method requires moving huge amounts of data, much of which has to be duplicated. Further, some data doesn't lend itself to relational databases. Finally, you may not own all the data, such as information accessed over the Internet.

The other solution is to develop universal-access techniques that can use a single high-level API to access data wherever and however it's stored. This solution focuses on programmatic data access rather than the database, and it allows continual innovation in data store formats. The API could be extended or adapted to new formats as they are developed. This is the approach favored by Microsoft and its Universal Data Access (UDA) specification.

If you have many applications that access data stored in many different formats, a gloom is probably settling on you as you realize that the transition to a whole new data-access strategy is going to mean a lot of late nights and long weekends. The good news is that UDA is built on technologies that Microsoft has gradually introduced over the past few years.

Still, UDA isn't a rebundling of a bunch of existing technologies with new names. It's actually a combination of old and new, designed to make it easy to continue using data-access methods that work for you now, particularly for data in relational databases. Microsoft is remarkably adept at repackaging and extending its own technologies and those of others, as it's now doing with UDA.

There's an amazing amount of outdated UDA information floating around, even on Microsoft's Web site. Part of the problem is that Microsoft is incorporating several relatively immature technologies into UDA, and their pre-UDA descriptions will persist until all the details are worked out.

UDA allows application access to any type of data (see chart, right). It's designed for both standalone and Web applications. For browser apps, UDA supports client applications written in a Web script and server programs that deliver static HTML pages to the client.

While ActiveX Data Objects and OLE DB are the central technologies in UDA, you don't have to run right out and learn them to use this new spec. UDA does not preclude your continuing to access relational data with data access objects (DAOs) used with the Access Jet engine, remote data objects (RDO) as a thin layer over ODBC, ODBCDirect, or even the ODBC API directly.

Microsoft says it will continue to support these interfaces. Given the company's past behavior, we can't expect that to go on forever, but it's a good bet that it will support these interfaces until UDA is mature and well-accepted.

UDA lets you access relational, nonrelational, and mainframe data. Typical relational databases supported include Oracle, SQL Server, and FoxPro. Most of these have some extensions for storage of random, nonrelational data, but mostly they are the typical collection of databases, tables, and recordsets. You can continue to access these databases through ODBC or ODBC wrappers, or move to OLE DB, Microsoft's database-access system developed for use with OLE technology.

Accessing nonrelational data, such as data in E-mail or directories, is the evolutionary part of the UDA design. Working through an OLE DB provider, which is similar in concept to an ODBC driver and provides low-level access services, apps can get access to any kind of data, in any format, with any type of structure. Someone just has to take the time to write an OLE DB provider for that data store.
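The provider idea can be sketched in a few lines of Python. This is a conceptual illustration only, not OLE DB itself: the `Provider` protocol, the two provider classes, and the `find` helper are all hypothetical names invented here. The point is that each store gets its own adapter, and consumer code calls one interface regardless of how the data is laid out underneath.

```python
import csv
import io
from typing import Dict, Iterator, List, Protocol


class Provider(Protocol):
    """Hypothetical stand-in for the role a provider plays: expose rows."""

    def rows(self) -> Iterator[Dict[str, str]]:
        ...


class CsvTextProvider:
    """Exposes comma-separated text as rows keyed by column name."""

    def __init__(self, text: str) -> None:
        self._text = text

    def rows(self) -> Iterator[Dict[str, str]]:
        yield from csv.DictReader(io.StringIO(self._text))


class MailboxProvider:
    """Flattens a hierarchical folder/message structure into rows."""

    def __init__(self, folders: Dict[str, List[Dict[str, str]]]) -> None:
        self._folders = folders

    def rows(self) -> Iterator[Dict[str, str]]:
        for folder, messages in self._folders.items():
            for message in messages:
                yield {"folder": folder, **message}


def find(provider: Provider, column: str, value: str) -> List[Dict[str, str]]:
    """Consumer code: the same call works against any provider."""
    return [dict(row) for row in provider.rows() if row.get(column) == value]


# Two very different stores, made up for the example.
orders = CsvTextProvider("customer,item\nacme,widget\nglobex,gear\n")
mail = MailboxProvider({
    "Inbox": [{"customer": "acme", "subject": "Order status"}],
    "Sent": [{"customer": "globex", "subject": "Invoice"}],
})

acme_orders = find(orders, "customer", "acme")
acme_mail = find(mail, "customer", "acme")
```

Supporting a new data store in this sketch means writing one more class with a `rows` method; `find` and any other consumer code stay unchanged.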

The third type of data, located in legacy systems, could be relational or not, but would be accessed through OLE DB as well.

Component Strategy
The primary components of UDA are ActiveX Data Objects, OLE DB, and ODBC, with Remote Data Services (formerly called Active Data Connector or ADC) for distributed data access in browsers. These are all relatively new APIs from Microsoft but are built on earlier technologies, particularly OLE and the Component Object Model.

The primary interface between all UDA applications and the data is OLE DB, the new system-level programming interface to data. OLE DB is a component database architecture, a set of OLE interfaces for access to data regardless of location or type. The benefit to using OLE DB as a layer in data access is that it provides one consistent API for access to all data, regardless of source, simplifying application programming in the same way ODBC simplified programming for relational data (see chart, left).

One concern about using OLE DB and ODBC together to access relational data is that it adds yet one more layer to the mix, which causes a performance degradation. But in theory, performance should be similar to ODBC as it is today, because OLE DB replaces the ODBC Manager's role, substituting one layer with another. This takes data abstraction one step further, using ODBC as a single interface to all relational data and OLE DB as a single interface to all data of any kind.

OLE DB is a C++ OLE specification and is mind-numbingly complex. Fortunately, you are unlikely to ever need to use OLE DB directly for application programming, although you certainly have that option. The only people likely to need to deal with it are Microsoft employees and anyone who writes an OLE DB provider or data store engine. Most applications will access its features through ADO, a thin wrapper around OLE DB.

ADO is designed to be the application-level programming interface to data, with a disarmingly simple object model (see chart, below). While generally similar to other object hierarchies from Microsoft, it uses a much more flexible nonhierarchical scheme that will take some getting used to. The individual objects and collections, except for the Error and Field objects, can exist independently of the other objects and can be connected to multiple objects in the hierarchy, making them separately creatable and portable. For example, you can create a Parameters collection for use with a Command object, then use the same independent Parameters collection with another Command object, saving the chore of recreating the set of parameters.
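The Parameters example is easy to picture in code. The sketch below is not ADO: `Parameters` and `Command` are hypothetical Python classes that only borrow the names, and Python's standard `sqlite3` module stands in for the data source. It shows one free-standing set of parameters built once and attached to two different commands.

```python
import sqlite3
from typing import Any, Dict, List, Tuple


class Parameters:
    """Hypothetical free-standing collection of named parameter values."""

    def __init__(self, **values: Any) -> None:
        self.values: Dict[str, Any] = dict(values)


class Command:
    """Hypothetical command: statement text plus an attachable Parameters."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.parameters = Parameters()

    def execute(self, connection: sqlite3.Connection) -> List[Tuple]:
        # sqlite3 binds :name placeholders from a mapping.
        return connection.execute(self.text, self.parameters.values).fetchall()


# An in-memory table with made-up data stands in for the data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("west", 50), ("west", 200), ("east", 300)],
)

# Build the parameters once, independently of any command.
shared = Parameters(region="west", min_total=100)

count_cmd = Command(
    "SELECT COUNT(*) FROM orders WHERE region = :region AND total >= :min_total"
)
list_cmd = Command(
    "SELECT total FROM orders WHERE region = :region AND total >= :min_total"
)

# Attach the same collection to both commands.
count_cmd.parameters = shared
list_cmd.parameters = shared

count_rows = count_cmd.execute(conn)
list_rows = list_cmd.execute(conn)
```

Because both commands hold a reference to the same collection, changing a value in `shared` affects the next execution of either one.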

Making The Connection
The Connection object is a link to the data source. If the command you issue through the connection returns rows of information, it creates a Recordset object. The Command object is a query or statement that the data source can process. This object will most commonly be used with relational databases, which may or may not accept parameters that can be passed using the Parameters collection. Because not all data providers support command execution, this object is optional.

The Recordset object is the most complex in the ADO hierarchy, since most of the features of cursors are contained here. It is quite similar to the recordsets in DAO and RDO but is streamlined and extended with optional features. Error collections are created and returned when errors occur within ADO or OLE DB and are passed back to ADO.

The flexibility of accessing data with ADO lies in the Properties collection of Property objects. You can create and then attach a Properties collection to any of the Connection, Command, and Recordset objects. This allows the ADO service provider to expand ADO for custom features. Herein lies the real power and flexibility of ADO in the UDA model for accessing any kind of computer data.

ADO properties can be either built-in or dynamic. Built-in properties are implemented with ADO itself and are available to any new object created in ADO. Built-in properties don't appear as Property objects with a Properties collection. Dynamic properties are custom properties defined by the underlying data provider. This could confuse a programmer and entails some work to make an application flexible enough to handle any dynamic properties it finds. This is one area where Microsoft needs to clarify the specification.
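The extra work that dynamic properties demand of an application can be illustrated with a small Python sketch. Nothing here is ADO's actual interface: the `Connection` class, the `read_settings` helper, and every property name and value are invented for the example. The built-in setting is an ordinary attribute that is always present, while provider-defined settings live in a collection that the application has to probe defensively.

```python
from typing import Any, Dict


class Connection:
    """Hypothetical object with one built-in setting and dynamic properties."""

    def __init__(self, provider_properties: Dict[str, Any]) -> None:
        # Built-in: always present, reached as a plain attribute.
        self.timeout_seconds = 15
        # Dynamic: whatever the underlying provider chose to define.
        self.properties: Dict[str, Any] = dict(provider_properties)


def read_settings(connection: Connection, wanted: Dict[str, Any]) -> Dict[str, Any]:
    """Look up each wanted property, falling back to a default if absent."""
    return {
        name: connection.properties.get(name, default)
        for name, default in wanted.items()
    }


# Two providers exposing different, made-up dynamic properties.
mail_store = Connection({"Folder Separator": "/", "Supports Search": True})
sql_store = Connection({"Row Limit": 1000})

# The application names what it cares about and what to assume otherwise.
wanted = {"Supports Search": False, "Row Limit": None}

mail_settings = read_settings(mail_store, wanted)
sql_settings = read_settings(sql_store, wanted)
```

An application written this way runs against either provider, but only because it never assumes a given dynamic property exists.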

The final link in the UDA system is the Remote Data Service. RDS is a data connection and publishing framework for applications hosted in a browser, namely Internet Explorer, using HTTP, its secure sibling HTTPS, and Distributed COM protocols. Microsoft has absorbed RDS into ADO.

RDS is a connectionless protocol, required for data access over the Internet. The Net is a huge network with many people accessing widely distributed data, and has new types of data that traditional client-server apps haven't needed to deal with. Before the Internet, an application could establish a persistent connection to a local or networked database for as long as the application needed the connection. By design, the Net is hostile to such persistent connections, and applications must be able to handle packets of information received sporadically. This is why technologies such as COM have taken so long to be adapted to the Internet, as they undergo fundamental changes. In essence, RDS does the same thing over the Internet that ADO does for traditional networked applications.
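The difference between a persistent connection and connectionless access can be sketched as follows. This is a conceptual Python model, not the RDS protocol: `SERVER_ROWS`, `fetch`, `submit`, and `DisconnectedRows` are hypothetical, and plain function calls stand in for what would be separate HTTP requests. The client fetches a copy of the rows, edits them with no connection held open, then sends back only what changed.

```python
import copy
from typing import Any, Dict, List, Set

Row = Dict[str, Any]

# Stand-in for a remote data store, with made-up rows.
SERVER_ROWS: Dict[int, Row] = {
    1: {"name": "Ann", "city": "Boise"},
    2: {"name": "Raj", "city": "Austin"},
}


def fetch(keys: List[int]) -> Dict[int, Row]:
    """One self-contained request: returns copies and keeps no session."""
    return {key: copy.deepcopy(SERVER_ROWS[key]) for key in keys}


def submit(changes: Dict[int, Row]) -> int:
    """A second, independent request carrying only the changed rows."""
    for key, row in changes.items():
        SERVER_ROWS[key] = copy.deepcopy(row)
    return len(changes)


class DisconnectedRows:
    """Client-side copy of rows that remembers which ones were edited."""

    def __init__(self, rows: Dict[int, Row]) -> None:
        self._rows = rows
        self._dirty: Set[int] = set()

    def update(self, key: int, column: str, value: Any) -> None:
        self._rows[key][column] = value
        self._dirty.add(key)

    def pending(self) -> Dict[int, Row]:
        return {key: self._rows[key] for key in self._dirty}


local = DisconnectedRows(fetch([1, 2]))
local.update(2, "city", "Denver")      # local edit only

before_submit = SERVER_ROWS[2]["city"]  # store is untouched so far
sent = submit(local.pending())          # one short request with one row
after_submit = SERVER_ROWS[2]["city"]
```

A real implementation would also have to deal with rows that changed on the server between the fetch and the submit, which this sketch ignores.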

Universal Acceptance
UDA may provide universal access to data, but it is not yet universally accepted by the industry. Some aggressive and longtime Microsoft database partners, such as Intersolv, International Software Group, Informix, and Sybase, have endorsed UDA and plan to adapt components such as ODBC drivers to the ADO/OLE DB framework. But many others are hedging their bets, with IBM, Informix, and Oracle pursuing object relational databases for managing complex data. Oracle plans to support a basic OLE DB provider for its databases but doesn't plan to fully embrace it, because it's too generic an interface and doesn't expose some of the sophisticated database features.

One criticism of UDA is that it ties developers and enterprises to Windows. Microsoft counters by saying that it will support UDA on any platform that supports OLE, but for now, that is only the various versions of Windows. The other problem is that most companies favor an open standard, which UDA is not.

Microsoft's definition of a standard is a published specification that appears to be widely supported. The industry at large, however, is most comfortable when a standard is under the control of an independent standards body that relies on an open public process.

Microsoft's goal of high-performance, universal data access to all enterprise data stores, with a reasonable migration path and broad industry support, is one of the toughest it has set for itself. Broad industry support is far from a given, and the rate of innovation in data stores is overwhelming. But even if Microsoft can get all of its own proprietary data stores in line, this will be a major accomplishment worth pursuing.

Don Kiely is the director of technology for SkyFire Group, a Windows and Internet application development firm. He can be reached at donkiely@computer.org .

