Let The Data Flow

Integrating operational data has long been one of the toughest IT problems. One option is to rip and replace legacy systems and consolidate on a single application platform, but that's neither affordable nor practical. You'll find a better answer in business integration convergence, a trend toward combining techniques including metadata repositories and discovery, data modeling and data quality services, ETL and EII, and integration with message brokers and ESBs for timely data synchronization.

Build a Shared Business Vocabulary

One critical aspect of master data management (MDM) is data definition: You must be sure all entities become master data only through common definitions, names and integrity rules. Like it or not, different versions of master data must be maintained. A shared business vocabulary (SBV) tracks changes to master data definitions. Presenting it in XML form in a metadata repository allows data modeling tools, data quality tools and integration software (including message brokers) to work with the SBV in standard ways.
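As a minimal sketch of what a versioned SBV entry in XML might look like, the Python below builds one definition and reads it back using only the standard library. The element and attribute names (sbv, entity, attribute, rule) and the Customer example are hypothetical; the article does not define a schema.

```python
import xml.etree.ElementTree as ET

def build_sbv_entry(entity, version, attributes):
    """Build one versioned SBV definition as XML (hypothetical schema)."""
    root = ET.Element("sbv")
    ent = ET.SubElement(root, "entity", name=entity, version=str(version))
    for name, (data_type, rule) in attributes.items():
        attr = ET.SubElement(ent, "attribute", name=name, type=data_type)
        # Each attribute carries its integrity rule as plain text.
        ET.SubElement(attr, "rule").text = rule
    return root

def read_attribute_names(xml_text):
    """Return the attribute names defined in an SBV document."""
    root = ET.fromstring(xml_text)
    return [a.get("name") for a in root.iter("attribute")]

# Illustrative definition: version 2 of a Customer entity.
entry = build_sbv_entry(
    "Customer", 2,
    {"customer_name": ("string", "not empty"),
     "country_code": ("string", "ISO 3166-1 alpha-2")},
)
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
print(read_attribute_names(xml_text))
```

Keeping the version on the entity element is one way to let tools tell definitions apart as they change; a real repository would likely need richer lineage than this.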

Learning what data definitions are out there will give you an indication of how many master data versions exist in different systems. Along with identifying definitions, look at the relationships between them to refine your understanding of which definitions refer to the same thing, such as a customer name. Informatica's SuperGlue and IBM's Rational Data Architect are useful tools for this exercise.

The next steps are to map the disparate data to your master data definitions, sample data sources to get a profile of their data quality, and then create rules to cleanse and transform the data. Now you're ready to consolidate master data. Look for differences in metadata among the sources, which must be mapped to the SBV definitions. When all this is captured in a metadata repository, you'll be able to generate "artifacts" (EII views, BI tool views, message broker XSLTs and so on) that deliver application-specific master data versions still faithful to the common SBV (see "Common Vocabularies," at right).

Marked up in XML, master data can flow through an ESB, which lets the data remain consistent wherever it goes. This approach levels the playing field for all users of the common resource. Relational database views, translated into XML views, can also be queried by systems using XQuery. Applying the same procedure to unstructured data will bring more resources into the mix.
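The mapping, profiling and cleansing steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any vendor tool: the source systems ("crm", "billing"), the field names and the two cleansing rules are all hypothetical.

```python
# Hypothetical mapping from each source system's field names to SBV names.
FIELD_MAP = {
    "crm": {"cust_nm": "customer_name", "ctry": "country_code"},
    "billing": {"CustomerName": "customer_name", "Country": "country_code"},
}

def to_master(source, record):
    """Rename a source record's fields to the SBV definitions."""
    mapping = FIELD_MAP[source]
    # Fields with no SBV mapping are dropped.
    return {mapping[k]: v for k, v in record.items() if k in mapping}

def profile(records, field):
    """Return the fraction of records whose value for field is missing or blank."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if not str(r.get(field) or "").strip())
    return missing / len(records)

def cleanse(record):
    """Illustrative rules: trim whitespace and upper-case country codes."""
    cleaned = {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}
    if cleaned.get("country_code"):
        cleaned["country_code"] = cleaned["country_code"].upper()
    return cleaned

crm_rows = [{"cust_nm": " Acme Ltd ", "ctry": "gb"}, {"cust_nm": "", "ctry": "us"}]
mapped = [to_master("crm", r) for r in crm_rows]
print(profile(mapped, "customer_name"))  # share of blank customer names
print([cleanse(r) for r in mapped])
```

Profiling before cleansing, as here, shows how bad each source is before any rule hides the problem.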

Remember to look not only at the master data but at the services. Processes must handle the master data's maintenance, auditing and synchronization until your organization can remove redundant master data versions and replace the logic to update that data with calls to common master data services. And as master data is consolidated, it will likely become the source for data warehouses and dimensions, not just operational systems.

MDM and ESB Working Together

Whether you manage master data separately from any application or designate one application as the master, you still must deal with a classic data-management problem: What happens when operational applications conflict while trying to update the master data? One way to avoid this is to link users' enterprise portals to the ESB. Then, as other application user interfaces are re-engineered to integrate with the portals, these portlets will automatically be linked with the ESB and the common data services, as in "Maintain Master Data," below.
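To make the conflict problem concrete, here is a minimal sketch of a common data service that rejects updates based on a stale copy of a record. The version check (optimistic concurrency) is a technique chosen here for illustration; the article does not prescribe it, and the MasterDataService class and its methods are hypothetical.

```python
class ConflictError(Exception):
    """Raised when an update is based on a stale version of the record."""

class MasterDataService:
    """Hypothetical single entry point for all master data updates."""

    def __init__(self):
        self._records = {}  # key -> (version, data)

    def read(self, key):
        """Return (version, data); unknown keys start at version 0."""
        return self._records.get(key, (0, {}))

    def update(self, key, expected_version, changes):
        """Apply changes only if the caller saw the current version."""
        version, data = self.read(key)
        if version != expected_version:
            raise ConflictError(
                f"{key}: expected v{expected_version}, found v{version}")
        self._records[key] = (version + 1, {**data, **changes})
        return version + 1

service = MasterDataService()
service.update("cust-1", 0, {"customer_name": "Acme Ltd"})
version, _ = service.read("cust-1")  # two applications both read this version
service.update("cust-1", version, {"country_code": "GB"})  # first write succeeds
try:
    service.update("cust-1", version, {"country_code": "US"})  # stale write
except ConflictError as err:
    print("rejected:", err)
```

Because every application goes through the same service, the conflict is caught in one place instead of surfacing later as diverging copies.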

As the messaging, integration and process engine of a service-oriented architecture (SOA), the ESB touches application services as well as BPM tools. When data is entered via the portal, it can move through the ESB to reach appropriate apps. Linked to MDM, the ESB will have addressed the data quality issues and master data changes before the data reaches the operational application packages and data warehouses.

With the ESB architecture, changes to master data can "trickle feed" into BI systems; that is, they refresh those systems continuously or at short intervals as changes occur. The event-driven nature of ESB and message broker integration can regulate the movement of master data changes into the data warehouse, with data quality tools acting as the firewall to ensure accuracy and completeness. Data synchronization functions in ESB and other application middleware can keep operational systems up to date.
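The trickle-feed idea can be sketched as a queue of change events drained one at a time through a data quality gate. In this simplified illustration an in-memory deque stands in for the message broker and a dictionary stands in for the warehouse; the required fields and the function names are hypothetical.

```python
from collections import deque

REQUIRED_FIELDS = ("customer_id", "customer_name")  # hypothetical quality rule

def passes_quality(change):
    """Quality gate: every required field must be present and non-blank."""
    return all(str(change.get(f) or "").strip() for f in REQUIRED_FIELDS)

def trickle_feed(events, warehouse, rejected):
    """Drain queued change events one at a time into the warehouse."""
    while events:
        change = events.popleft()
        if passes_quality(change):
            warehouse[change["customer_id"]] = change  # upsert latest version
        else:
            rejected.append(change)  # held back for review

events = deque([
    {"customer_id": "c1", "customer_name": "Acme Ltd"},
    {"customer_id": "c2", "customer_name": ""},
    {"customer_id": "c1", "customer_name": "Acme Limited"},
])
warehouse, rejected = {}, []
trickle_feed(events, warehouse, rejected)
print(warehouse)
print(rejected)
```

Rejected changes are kept instead of discarded, so incomplete records can be corrected at the source and resubmitted.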

With tools maturing, you have many options for integrating information inside and outside your enterprise. All have roles to play in business integration, including the development of a master data resource that can reduce the headaches of sharing and synchronizing data among operational systems. MDM can also simplify SOA information integration and use the Web architecture to keep master data current.

Don't forget the importance of common metadata, something organizations have been struggling with since information systems were invented. Master metadata, in the form of an SBV, is the secret to developing and implementing data integration and sharing functions that deliver the biggest business benefits to the enterprise.

Mike Ferguson is managing director of Intelligent Business Strategies, which specializes in IT analysis and consulting. He focuses on enterprise BI and business integration.