Kimball University: Data Stewardship 101: First Step to Quality and Consistency
Data stewards are the liaisons between business users and the data warehouse team, and they help ensure that the business receives consistent, accurate, well-documented and timely insight into resources and requirements.
Roles and responsibilities may vary depending on whether the steward is responsible for dimension tables, fact tables or both. In general, a data steward must:
• Become familiar with the business users and their various usage profiles to convey requirements and ease-of-use concerns to the data warehouse project team.
• Understand business requirements and how the data supports those requirements to help users leverage corporate data.
• Develop in-depth knowledge of the structure and content of the data warehouse--including tables, views, aggregates, attributes, metrics, indexes, primary and foreign keys, and joins--to answer data-related questions and enable a broader audience to analyze the data directly.
• Interpret new and changing business requirements to determine their impact on data warehouse design and to propose enhancements and changes to meet these new requirements.
• Analyze the potential impact of data definition changes proposed by the business and communicate related requirements to the entire data warehouse team.
• Get involved early in source-system enhancement or content changes to ensure that the data warehouse team is prepared to accept these changes.
• Comply with corporate and regulatory policies to verify data quality, accuracy and reliability, including establishing validation procedures to be performed after each data load and prior to its release to the business. Stewards must withhold new data and communicate status if significant errors are identified.
• Establish and perform data certification processes and procedures while exercising proper due diligence in ensuring compliance with related corporate and regulatory requirements.
• Provide metadata that describes the data, offers a business description/definition and identifies the source data element(s) and any business rules or transformations used to deliver the data.
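The validation responsibility in the list above (check each load, then either release the data or withhold it and communicate status) can be sketched as a simple release gate. This is a minimal illustration, not a prescribed procedure: the row layout, the check names, the `run_check` and `release_decision` helpers, and the 10 percent failure threshold are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

Row = Dict[str, object]  # hypothetical shape of one loaded record


@dataclass
class CheckResult:
    name: str
    failed_rows: int
    total_rows: int

    @property
    def failure_rate(self) -> float:
        # An empty load has no failures by definition.
        return self.failed_rows / self.total_rows if self.total_rows else 0.0


def run_check(name: str, rows: Iterable[Row],
              predicate: Callable[[Row], bool]) -> CheckResult:
    """Count the rows that do not satisfy a validation rule."""
    rows = list(rows)
    failed = sum(1 for row in rows if not predicate(row))
    return CheckResult(name, failed, len(rows))


def release_decision(results: List[CheckResult],
                     max_failure_rate: float) -> Tuple[str, List[str]]:
    """Withhold the load if any check exceeds the agreed threshold."""
    failing = [r.name for r in results if r.failure_rate > max_failure_rate]
    return ("withhold", failing) if failing else ("release", [])


# Hypothetical rows from the latest load.
loaded = [
    {"order_id": 1, "customer_key": 10, "amount": 25.0},
    {"order_id": 2, "customer_key": None, "amount": 40.0},
    {"order_id": 3, "customer_key": 12, "amount": -5.0},
    {"order_id": 4, "customer_key": 13, "amount": 12.5},
]

results = [
    run_check("customer_key present", loaded,
              lambda r: r["customer_key"] is not None),
    run_check("amount non-negative", loaded,
              lambda r: r["amount"] >= 0),
]

# The threshold is an assumption; the business defines "significant".
status, failing = release_decision(results, max_failure_rate=0.10)
print(status, failing)
```

When the decision is "withhold", the list of failing checks gives the steward something concrete to communicate to the business while the load is corrected.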
In addition to all of the above, data stewards who are specifically responsible for conformed dimensions for the enterprise must help forge agreement on their definition and use in downstream analytic environments. They also must determine departmental interdependencies and ensure that the conformed dimensions meet business needs across business processes and departments. In some large organizations, developing consensus on conformed dimensions can be a significant political challenge, so data stewards need to communicate and coordinate with the other stewards, reach agreement on data definitions and domain values, and minimize conflicting or redundant efforts. When conflicts arise, stewards must get data warehouse senior management sponsors involved to resolve cross-departmental issues.
Stewards who support fact tables must ensure that conformed dimensions are used in their creation to avoid redundant or nonconforming tables. They must also ensure that any metrics used in multiple fact tables are conformed across business events. Finally, they must understand any consolidated or aggregate tables built on their fact tables, and they must put processes in place to remove aggregated tables that may be invalidated by slowly changing dimensions.
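One way to put the aggregate cleanup process into practice is to record which dimension attributes each aggregate table summarizes, then flag every aggregate that touches an attribute whose values have been changed. The sketch below assumes such a registry exists; the table names, the attribute names and the `invalidated_aggregates` helper are hypothetical.

```python
from typing import Dict, List, Set


def invalidated_aggregates(aggregates: Dict[str, Set[str]],
                           changed_attributes: Set[str]) -> List[str]:
    """Return aggregate tables built on any changed dimension attribute."""
    return sorted(
        name
        for name, attributes in aggregates.items()
        if attributes & changed_attributes  # any overlap invalidates it
    )


# Hypothetical registry: aggregate table -> attributes it rolls up on.
aggregates = {
    "agg_sales_by_region_month": {"customer.region", "date.month"},
    "agg_sales_by_product_category": {"product.category"},
    "agg_sales_by_day": {"date.day"},
}

# Attributes whose existing values were changed in the latest
# dimension load (assumed to be reported by the load process).
changed = {"customer.region"}

to_rebuild = invalidated_aggregates(aggregates, changed)
print(to_rebuild)
```

The tables returned are candidates for removal or rebuilding before the data is released; aggregates that do not reference a changed attribute are left alone.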