Reference Data Management: What, Why, and How - InformationWeek

Reference Data Management: What, Why, and How

Razza Dimension Server 5.0 offers a modern approach to reference data management.

At first, hierarchy and reference data management may not sound like something to get excited over — if you even understand what it means right away, since it sounds esoteric and complex. Fortunately, though, reference (or master) data management is exactly what you would imagine it to be: the management of data that typically resides in "master" tables, such as customer, location, product, and, of course, the innumerable "type" tables that clutter up our databases. This data also sometimes assumes the rather fancy alias of "dimensions," particularly in the context of data warehousing.

Mastering the Reference Dimension

Reference data can legitimately claim to be the Rodney Dangerfield of data — it just doesn't get any respect. And frankly, who's too bothered about all those dimensions? It's the facts that we're after. Yet managing reference data — particularly hierarchical reference data such as product and geographic hierarchies — has been the bane of many an application database, whether for custom OLTP applications, ERP, or data warehousing and business intelligence. Architects, modelers, and managers of data know that in the fairyland of data, reference data is the imp: hard to control and constantly creating mischief. And, although it is facts such as dollar sales or revenue that users pursue, these facts are meaningless without the context provided by their dimensions — in other words, the reference data. The number 1,000,000 tells us nothing — until we learn it's a sales amount, stated in currency C, for product X, in the region R of nation N, achieved by salesperson P, in the year Y and month M — each of which is reference data! Reference data is tremendously important because it provides a frame of reference to information, without which the information is meaningless.

Why is reference data so hard to manage? Because there's no single definition of such data. The profusion of system-of-record applications and data sources in today's world leads, more often than not, to multiple sources of reference data, each of which is true to its own domain, but may or may not agree with others. This situation is usually further confounded by a pervasive lack of coordination and standards for reference data, at both the business process and technology levels. Every IT solution that needs reference data typically builds containers and presentation components for it or builds custom bridges to other existing data sources — and thus adds more threads to the spaghetti of reference data. All this complexity, effort, and cost could largely be avoided by defining a single source of truth for reference data. The Razza Dimension Server from Razza Solutions exists to build such a single source of truth. (See Figure 1.)

Figure 1: Razza Dimension Server intervenes between source and reporting/analysis systems to ensure data hierarchies are clean.

Introducing Razza Dimension Server

The stated purpose of the Razza Dimension Server (hereafter, RazzaDS) is "to greatly simplify master data harmonization across multiple enterprise systems." RazzaDS is conceptually a simple solution. Recognizing that most dimensions are basically a hierarchy, RazzaDS lets users develop their own hierarchies of business dimensions, without regard for the underlying metadata aspects such as data types and lengths. For example, say that users want to create a geographical hierarchy consisting of countries, regions, states, and postal codes. The usual data modeling approach would be to define these as entities, connected by identifying or nonidentifying relationships. In RazzaDS, users do away with all metadata concerns and directly enter the hierarchy values into a simple, intuitive, hierarchical folderlike user interface. (See Figure 2.) Thus, users could directly enter the value "USA," followed by four child regions: West, Midwest, South, and East. Then, under the South region, users could enter states, such as Florida, Georgia, and Louisiana. Finally, postal codes would be entered under each of these states. The nodes (called either limbs or leaves in RazzaDS) have properties, which characterize the nodes. (Actually, properties can also be defined at the hierarchy or version level. More on versions later.)

Figure 2: Users enter hierarchy values into a simple, folderlike user interface.
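
To make the geographic example concrete, here is a minimal Python sketch of such a hierarchy. It is not RazzaDS code and does not reflect its actual API. The `Node` class, its `add` and `paths` helpers, and the postal code value are all hypothetical, and the sketch assumes that a "limb" is a node with children and a "leaf" is a node without any, which the product description does not spell out.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in a dimension hierarchy (hypothetical model, not the RazzaDS API)."""
    name: str
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def add(self, name, **properties):
        """Create a child node under this one and return it."""
        child = Node(name, dict(properties))
        self.children.append(child)
        return child

    @property
    def kind(self):
        # Assumption: a limb has children, a leaf has none.
        return "limb" if self.children else "leaf"

    def paths(self, prefix=()):
        """Yield the full path from the root to every node, depth first."""
        here = prefix + (self.name,)
        yield here
        for child in self.children:
            yield from child.paths(here)


# Enter values directly, top down, with no data types or lengths declared.
usa = Node("USA")
for region in ("West", "Midwest", "South", "East"):
    usa.add(region)

south = usa.children[2]
for state in ("Florida", "Georgia", "Louisiana"):
    south.add(state)

florida = south.children[0]
florida.add("33101")  # illustrative postal code value
```

Calling `list(usa.paths())` walks the tree and returns tuples such as `("USA", "South", "Florida", "33101")`, which is roughly the information a relational model would spread across four related entities.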

One of the strongest features of RazzaDS is that properties can be local or global, primary or derived (my terms), declared or inherited, newly defined or predefined (such as for Essbase), and more. In data modeling terms, properties are the non-key attributes of the dimensional entities.
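
The declared-versus-inherited and primary-versus-derived distinctions can be illustrated with a short Python sketch. This is only one plausible reading, not a description of how RazzaDS resolves properties: it assumes an inherited property takes its value from the nearest ancestor that declares it, and the `Node` class, the `label` function, and the property names `currency` and `sales_tax_region` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hierarchy node with declared properties (hypothetical, not the RazzaDS API)."""
    name: str
    parent: Optional["Node"] = None
    declared: dict = field(default_factory=dict)

    def get(self, key):
        """Return the value declared here, else the nearest ancestor's value."""
        node = self
        while node is not None:
            if key in node.declared:
                return node.declared[key]
            node = node.parent
        raise KeyError(key)

    def is_inherited(self, key):
        """True when the value comes from an ancestor rather than this node."""
        self.get(key)  # raises KeyError if no node on the path declares it
        return key not in self.declared


def label(node):
    # A derived property: computed from other properties instead of stored.
    return f"{node.name} ({node.get('currency')})"


usa = Node("USA", declared={"currency": "USD"})
south = Node("South", parent=usa)
florida = Node("Florida", parent=south, declared={"sales_tax_region": "FL"})
```

Here `florida.get("currency")` resolves through South up to USA, so the currency is declared once at the top and inherited everywhere below it, while `sales_tax_region` is declared on Florida alone.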
