Presenting a unified front pays off. Take the European Union. Some member countries are poor as church mice, but united, the EU is an economic force to be reckoned with, posting a 2003 GDP of $11.5 trillion--greater than the United States' 2003 GDP of $10.4 trillion.
Enterprise data is a lot like a group of countries. The facts and figures stored in various locations--in your ERP and CRM systems, for example--are valuable individually. But in aggregate, they present a priceless holistic picture of your business. If knowledge is power, this kind of worldview can make your company a force to be reckoned with as well.
At the very least, never again will Bob's shipping address show up on Mary's widget order.
Here's a golden opportunity for IT to contribute to the bottom line by enabling data retrieval from many sources through a single point of access. The technology needed to perform this unification--formerly known as federated data access and recoined as enterprise information integration, or EII--has been around for some time, providing virtual views into heterogeneous data sources. Not confined merely to structured sources, such as RDBMSs, EII platforms can federate access to nonstructured data, including XML files; Web services; CSV files; enterprise applications, such as PeopleSoft, SAP and Siebel; and Excel spreadsheets. They also can transform structured data into hierarchical XML documents. Some can even render database-query results as HTML.
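To make the structured-to-hierarchical transformation concrete, here is a minimal Python sketch of the idea. It is not how any particular EII product does it: the function name `rows_to_xml`, the element names and the sample rows are all hypothetical, and only the standard library is used.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(customers, orders):
    """Nest flat (relational-style) rows into one hierarchical XML document.

    customers: iterable of (customer_id, name) tuples      -- hypothetical shape
    orders:    iterable of (order_id, customer_id, item)   -- hypothetical shape
    """
    root = ET.Element("customers")
    for cust_id, name in customers:
        cust = ET.SubElement(root, "customer", id=str(cust_id), name=name)
        # Each order row becomes a child of the customer it belongs to.
        for order_id, order_cust_id, item in orders:
            if order_cust_id == cust_id:
                ET.SubElement(cust, "order", id=str(order_id), item=item)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample_customers = [(10, "Mary"), (11, "Bob")]
    sample_orders = [(1, 10, "widget"), (2, 10, "widget"), (3, 11, "widget")]
    print(rows_to_xml(sample_customers, sample_orders))
```

The foreign-key relationship between the two flat row sets becomes parent-child nesting in the document, which is the essence of the transformation.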
Why not use data warehouses instead? Although similar to EII products in practice, data warehouses play a vastly different role. Whereas EII works with data in real time, data warehouses are designed for historical and analytical applications. Data warehouses require data replication, which in turn requires a host of applications and processes to support the replication, cleansing and categorization of data as it is pulled from corporate sources and pushed into a warehouse. EII products, in contrast, open a window on raw data while leaving it in its place.
But don't scrap your warehouse yet: Although some EII platforms can replicate data, that is not their primary purpose. And though EII products can do many of the tasks required to create and maintain a data warehouse, they cannot replace a large-scale warehouse because of EII's focus on real-time integration and lack of comprehensive ETL (extract/transform/load) functionality.
Another term commonly heard in the same breath as EII is virtual database. The implication is that database tables--such as orders, customers and inventory--from multiple sources will be magically accessible over a virtual database, represented by an EII platform. In practice, virtual databases are containers, like physical databases, that group data constructs, such as tables and views, and provide an interface for application and developer access. Nice, to be sure, but not the be-all and end-all of EII.
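As a rough illustration of the container idea, the sketch below uses SQLite from Python's standard library as a stand-in. SQLite is not an EII platform, and the schema names `erp` and `crm`, the tables and the view `order_shipping` are all invented for the example. What it shows is one connection grouping tables from two separate databases and exposing a view over both.

```python
import sqlite3


def build_virtual_db():
    """Return one connection that groups two separate databases plus a view."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates its own independent in-memory database; they stand
    # in for two physically separate sources (assumption for illustration).
    conn.execute("ATTACH DATABASE ':memory:' AS erp")
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute(
        "CREATE TABLE erp.orders (order_id INTEGER, customer_id INTEGER, item TEXT)"
    )
    conn.execute(
        "CREATE TABLE crm.customers (customer_id INTEGER, name TEXT, ship_to TEXT)"
    )
    conn.executemany(
        "INSERT INTO erp.orders VALUES (?, ?, ?)",
        [(1, 10, "widget"), (2, 11, "widget")],
    )
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?, ?)",
        [(10, "Mary", "12 Elm St"), (11, "Bob", "9 Oak Ave")],
    )
    # A TEMP view may reference tables in any attached database, so it can
    # act as the single construct that applications query.
    conn.execute(
        """
        CREATE TEMP VIEW order_shipping AS
        SELECT o.order_id, c.name, c.ship_to
        FROM erp.orders AS o
        JOIN crm.customers AS c ON c.customer_id = o.customer_id
        """
    )
    return conn


if __name__ == "__main__":
    db = build_virtual_db()
    query = "SELECT order_id, name, ship_to FROM order_shipping ORDER BY order_id"
    for row in db.execute(query):
        print(row)
```

The application sees one interface and one view; where the underlying tables live is hidden behind the container.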
Give the People What They Want
A number of benefits are driving EII adoption. Some are business-related, while others are focused solely on IT.
It's a classic struggle: Business users want continuous, immediate access to 360-degree views of customers and other enterprise data. Within NWC Inc.--our Web-based widget manufacturing company and 24/7 business-applications lab (see inc.networkcomputing.com)--we want a consolidated view of widget orders, but we need to pull data from three distinct sources, both structured and unstructured, to achieve that.
[Figure: Before and After EII]
From an IT perspective, the application development, deployment of client-connectivity tools and increase in network traffic are major stumbling blocks when it comes to giving business users the powerful productivity boost that is ubiquitous, real-time data access.
Let's assume that business needs justify a build-it-yourself approach. Although custom development of applications to provide comprehensive data access isn't impossible--it's done all over the world--the build-it-yourself tack is inefficient and brings hefty ongoing administration and maintenance costs. EII platforms reduce the amount of client software necessary to provide connectivity to data sources, slashing the cost of deployment while making upkeep much easier.
In addition, network traffic can be greatly reduced on local segments by the introduction of EII because large data sets are distilled to only the desired information on the server rather than on the client. For example, to achieve the goal of bringing together data from an Oracle RDBMS and a Microsoft SQL Server sans EII, you would need to retrieve data from both databases, pull the results over the network and then join them on the client. And results wouldn't be delivered to the desktop in a single stream; rather, they'd dribble in, in blocks of 10 or 20. So it would take a whole lot of network traffic to get the right data to the right place.
If you had the right EII platform, this information joining would occur on the server, and only the relevant data would traverse the network and land on the client desktop. Full data streams do still travel from the servers on which the data sources reside to the EII platform, but the typically fatter pipes of a data center should easily handle that traffic flow.
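The difference between joining on the client and joining on the server can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a measurement: the two in-memory lists stand in for the Oracle and SQL Server result sets, the row counts are invented, and "rows sent to the client" is simply counted rather than transmitted.

```python
# Hypothetical stand-ins for two separate sources; sizes are invented.
CUSTOMERS = [(i, f"customer-{i}") for i in range(1, 1001)]       # 1,000 rows
ORDERS = [(100 + i, i, "widget") for i in range(1, 1001, 100)]   # 10 rows


def _join(customers, orders):
    """Hash join: match each order to its customer by customer id."""
    names = dict(customers)
    return [(oid, names[cid], item) for oid, cid, item in orders if cid in names]


def client_side_join():
    """Without EII: both full result sets reach the client, which joins them."""
    rows_to_client = len(CUSTOMERS) + len(ORDERS)
    return _join(CUSTOMERS, ORDERS), rows_to_client


def server_side_join():
    """With a joining middle tier: only the joined rows reach the client."""
    joined = _join(CUSTOMERS, ORDERS)   # runs in the data center
    rows_to_client = len(joined)
    return joined, rows_to_client


if __name__ == "__main__":
    client_rows, client_cost = client_side_join()
    server_rows, server_cost = server_side_join()
    print("same answer:", client_rows == server_rows)
    print("rows sent to client without EII:", client_cost)
    print("rows sent to client with EII:   ", server_cost)
```

Both paths produce the same joined result; what changes is how many rows must cross the local segment to reach the desktop. The full sets still travel between the sources and the middle tier, as noted above.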
Notice we said the right EII platform. This is important: The market takes two approaches to EII, but only one provides this kind of server-side, cost-based query optimization as part of its core feature set. Products lacking autonomous optimization generally are XML-focused and more concerned with presentation than with optimizing retrieval from back-end sources. That makes XML-focused EII platforms a perfect fit for organizations that have made standardization on XML as an interface a top priority, but a less-than-optimal solution for organizations for which bandwidth and legacy connectivity are problems.