
News In Review

May 3, 1999

XML Shows Great Promise For Server Development

Standard provides interchange format for distributed apps

By Don Kiely

The Internet has forced business developers to change the way they build applications. Distributed programs are increasingly the norm, cooperatively processing data on Web servers, databases, and legacy systems. A single application can have its component parts running on Windows, Unix, Linux, and OS/390 systems--a nightmare for developers and system administrators.

Such cross-platform applications have intensified the issue of how to share data efficiently between applications and between components. Every application used to have its own proprietary data format, forcing developers to spend a major part of their time translating data as it moved around a network between applications.

The World Wide Web Consortium's emerging Extensible Markup Language standard provides a portable data interchange format that's rapidly finding its way into many software products, such as Microsoft's ActiveX Data Objects, the Oracle8i database, and Web browsers. In most cases, however, such applications incorporate XML as a native import/export format while maintaining proprietary internal formats and requiring constant translation between proprietary and standard XML formats.

Most of the attention paid to XML so far has been focused on client-side document management, providing hierarchical nested fields in arbitrarily complex data structures. But there are two other common scenarios for using XML data that go beyond such document-centric uses. XML can enhance Web pages by extending HTML to deliver semantic meaning, rather than just formatted data. XML will also give Web pages intelligence about the information they contain, making possible sophisticated searches that can distinguish multiple word meanings.

XML's greatest promise is its potential to revolutionize distributed applications by providing a standard data-interchange format. While documents are generally meant to be human-readable, application data is machine-readable.

Application data is record-oriented rather than focused on sets of data. It is transmitted in text form with tags that travel with the data. This means that applications don't need custom parsing code but can use any of the widely available XML parsing engines, such as those from Microsoft and IBM.
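
As a minimal sketch of that point, the following Python fragment reads a small record-oriented message with the standard library's XML parser rather than custom parsing code. The element and attribute names (order, customer, item, sku, quantity) and the values are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical record-oriented message; the tags travel with the data.
MESSAGE = """
<order id="1001">
  <customer>Acme Corp</customer>
  <item sku="A-17" quantity="3">Widget</item>
  <item sku="B-02" quantity="1">Gasket</item>
</order>
"""


def parse_order(xml_text):
    """Turn an <order> document into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("id"),
        "customer": root.findtext("customer"),
        "items": [
            {
                "sku": item.get("sku"),
                "quantity": int(item.get("quantity")),
                "name": item.text,
            }
            for item in root.findall("item")
        ],
    }


order = parse_order(MESSAGE)
print(order["customer"], len(order["items"]))
```

Because the parser is generic, the receiving application only has to know the tag names it cares about, not the byte layout of the message.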

Multitier Architecture
The most popular model for distributed applications is a three-tier architecture, with XML playing a role in each tier. The user-interface tier can use XML to deliver "smart data" to an application so the client can do more with it without returning to the server and tying up the network. The same data can be loaded into various applications for different purposes.

A user, for example, can look at the same data in an Excel spreadsheet in a row-and-column format for analysis, in accounting software for reporting, or in a Web page for viewing by customers and staff. Each use lets the user search, query, and otherwise manipulate data, letting the client use its own resources to process data.

Server-side XML data is an efficient storage format for highly structured hierarchical data. Structured data is best stored in a hierarchical format because it doesn't have to be translated from other formats when it's needed. For example, structured data in a relational format has to be decomposed into component parts before it's served up in the hierarchical format of XML.

In the middle tier of a three-tier application, which commonly holds objects containing the business rules and links to underlying data stores, XML can serve to integrate data for tiers above and below it. It is in this tier that XML data shines as a common data format.

This raises the question of how and where XML data should be stored. No persistence format has yet emerged that is ideally suited to XML's hierarchical structure. At its worst, force-fitting XML data into a relational database is a kludge. Relational databases use a row-and-column metaphor in which a given cell may or may not contain data. XML data is hierarchical in nature, a format that suits some kinds of relational data but is highly inefficient for most. It's possible to create XML documents from relational data with run-time joins, but the process is slow and doesn't scale well.
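
To illustrate what creating XML from relational data with a run-time join involves, here is a rough sketch using Python's built-in sqlite3 and ElementTree modules. The orders and items tables, their columns, and the sample rows are assumptions made up for this example; it shows the translation step only and says nothing about any particular product.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_orders_xml(conn):
    """Assemble a nested <orders> document from two flat tables."""
    root = ET.Element("orders")
    # The join flattens the hierarchy into repeated rows...
    rows = conn.execute(
        "SELECT o.id, o.customer, i.sku, i.quantity "
        "FROM orders AS o JOIN items AS i ON i.order_id = o.id "
        "ORDER BY o.id, i.sku"
    )
    current_id, order_el = None, None
    for order_id, customer, sku, quantity in rows:
        # ...so the code must regroup rows to rebuild the nesting.
        if order_id != current_id:
            order_el = ET.SubElement(
                root, "order", id=str(order_id), customer=customer
            )
            current_id = order_id
        ET.SubElement(order_el, "item", sku=sku, quantity=str(quantity))
    return root


# Hypothetical in-memory schema and data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items (order_id INTEGER, sku TEXT, quantity INTEGER);
    INSERT INTO orders VALUES (1, 'Acme Corp'), (2, 'Globex');
    INSERT INTO items VALUES (1, 'A-17', 3), (1, 'B-02', 1), (2, 'C-40', 5);
""")
orders_xml = build_orders_xml(conn)
print(ET.tostring(orders_xml, encoding="unicode"))
```

Every request for the document repeats the join and the regrouping, which is the overhead the paragraph above refers to.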

Better Storage Scheme
Object databases may be a better fit for XML data because they provide a scheme for storing information about the data analogous to the way that XML tags give meaning to data. But object databases haven't become popular, and it isn't yet clear that using them to store XML data is any better than using a relational system.

Such issues have given rise to XML data servers that store and manipulate native XML data as efficiently as other data stores. XML servers can provide high-performance queries over large sets of XML components and integration with different data sources in a unified XML storage area. Furthermore, they can provide standard interfaces to common Web development languages--including JScript, VBScript, and Java--and provide a single point of administration for XML data.

XML data servers have the benefit of being based on industry standards. They're also scalable and quite flexible. XML is an approved W3C standard, making it portable and safer than proprietary techniques. Like any modern data engine, an XML data server can be built with features that make it more scalable, such as distributing the workload to multiple dedicated data servers, so that it continues to work efficiently as the number of users increases. (See the story on Object Design's Excelon XML.)

Depending on how the underlying data engine is implemented, an administrator could easily add servers as needed while maintaining data integrity and consistency. XML itself doesn't provide these features, but when its hierarchical data structure is combined with server features, the result can be a high-performance, flexible, and portable source of enterprise data.

The Flexibility Factor
The "X" in XML stands for extensible, giving rise to inherent flexibility in working with data. Adding a new field to a record or document simply requires adding an entry to the data hierarchy and only takes up space if there is data to be stored. Changing the data structure in a relational database is far more complex, as any database administrator can attest.
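
The following short Python sketch illustrates that kind of extensibility under simple assumptions: a hypothetical customer record gains an optional email element, and a reader written before the change keeps working. The record layout and function names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: the second adds an <email> field to the hierarchy.
OLD_RECORD = "<customer><name>Acme Corp</name></customer>"
NEW_RECORD = (
    "<customer><name>Acme Corp</name>"
    "<email>info@acme.example</email></customer>"
)


def read_name(xml_text):
    """An older reader that only knows about <name>."""
    return ET.fromstring(xml_text).findtext("name")


def read_email(xml_text):
    """A newer reader; returns None when the optional <email> is absent."""
    return ET.fromstring(xml_text).findtext("email")


for record in (OLD_RECORD, NEW_RECORD):
    print(read_name(record), read_email(record))
```

Records without the new field carry no placeholder for it, so the field takes up space only where there is data to store.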

XML will come into its own when it becomes a native data storage format and not simply an intermediate format for exchanging data between applications. It's unlikely that any of the major database systems from IBM, Microsoft, Oracle, and Sybase will store native XML data any time soon, but any of the new XML data servers can provide an efficient solution for data that must be used throughout an enterprise.

Don Kiely is a director at Information Insights, a Fairbanks, Alaska, consulting firm. He can be reached at donkiely@computer.org.
