Why You Need a Data Fabric, Not Just IT Architecture
Data fabrics offer an opportunity to track, monitor and utilize data, while IT architectures track, monitor and maintain IT assets. Both are needed for a long-term digitalization strategy.
As companies move into hybrid computing, they’re redefining their IT architectures. IT architecture describes a company's entire IT asset base, whether on-premises or in-cloud. This architecture is stratified into three basic levels: hardware such as mainframes, servers, etc.; middleware, which encompasses operating systems, transaction processing engines, and other system software utilities; and the user-facing applications and services that this underlying infrastructure supports.
IT architecture has become a focus recently because, as organizations move to the cloud, their IT assets move too, and those shifts need to be tracked and monitored.
However, with the growth of digitalization and analytics, there is also a need to track, monitor, and maximize the use of data that can come from a myriad of sources. An IT architecture can’t provide data management, but a data fabric can. Unfortunately, most organizations lack well-defined data fabrics, and many are still trying to understand why they need a data fabric at all.
What Is a Data Fabric?
Gartner defines a data fabric as “a design concept that serves as an integrated layer (fabric) of data and connecting processes. A data fabric utilizes continuous analytics over existing, discoverable and inferenced metadata assets to support the design, deployment and utilization of integrated and reusable data across all environments, including hybrid and multi-cloud platforms.”
Let’s break it down.
Every organization wants to use data analytics for business advantage. To use analytics well, you need data agility that enables you to easily connect and combine data from any source your company uses, whether that source is an enterprise legacy database or data culled from social media or the Internet of Things (IoT). You can't achieve data integration and connectivity without data integration tools, and you must also find a way to connect and relate disparate data in meaningful ways if your analytics are going to work.
This is where the data fabric comes in. The data fabric contains all the connections and relationships between an organization’s data, no matter what type of data it is or where it comes from. The goal of the fabric is to function as an overall tapestry that interweaves all data so that the data in its entirety is searchable. This has the potential not only to optimize data value, but also to create a data environment that can answer virtually any analytics query. The data fabric does what an IT architecture can’t: it tells you what the data does and how data elements relate to one another. Without a data fabric, a company's ability to leverage data and analytics is limited.
Building a Data Fabric
When you build a data fabric, it’s best to start small and in an area your staff already knows well.
That “place” for most companies will be with the tools that they are already using to extract, transform and load (ETL) data from one source to another, along with any other data integration software such as standard and custom APIs. All of these are examples of data integration you have already achieved.
Now, you want to add more data to your core. You can do this by continuing to use the ETL and other data integration methods you already have in place as you build out your data fabric. In the process, care should be taken to also add the metadata about your data, which will include the origin point for the data, how it was created, what business and operational processes use it, what its form is (e.g., single field in a fixed record, or an entire image file), etc. By maintaining the data’s history, as well as all its transformations, you are in a better position to check data for reliability, and to ensure that it is secure.
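The metadata described above can be sketched as a simple record that travels with each dataset. This is a minimal illustration, not the format of any particular data management product: the class name, field names, and sample values are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record for one dataset in the fabric."""
    name: str
    origin: str        # where the data came from
    created_by: str    # how the data was created
    data_form: str     # e.g. "single field in a fixed record" or "image file"
    used_by: list[str] = field(default_factory=list)  # processes that use the data
    transformations: list[tuple[str, str]] = field(default_factory=list)

    def record_transformation(self, description: str) -> None:
        """Append a timestamped entry to the dataset's transformation history."""
        timestamp = datetime.now(timezone.utc).isoformat()
        self.transformations.append((timestamp, description))


# Illustrative values only; none of these come from a real system.
orders = DatasetMetadata(
    name="orders",
    origin="legacy ERP database",
    created_by="nightly ETL job",
    data_form="fixed-record table",
    used_by=["billing", "sales reporting"],
)
orders.record_transformation("normalized currency fields to USD")
orders.record_transformation("joined with customer master data")

for timestamp, description in orders.transformations:
    print(timestamp, description)
```

Keeping the transformation history as an append-only list is one simple way to preserve the data's lineage so that reliability and security checks have something to inspect.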
As your data fabric grows, you will probably add data tools that are missing from your workbench. These might be tools that help with tracking data, sharing metadata, applying governance to data, etc. One recommendation here is to look for an all-inclusive data management package that contains not only all the tools you'll need to build a data fabric, but also important automation such as built-in machine learning.
The machine learning observes how data in your data fabric is working together, and which combinations of data are used most often in different business and operational contexts. When you query the data, the ML assists in pulling the data together that is most likely to answer your queries.
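To make the idea concrete, here is a deliberately simplified sketch of the "which data is used together" signal. It uses plain co-occurrence counting as a stand-in for machine learning; commercial tools will use more sophisticated methods. The function names, dataset names, and query log are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_co_usage(query_log):
    """Count how often each pair of datasets appears in the same query."""
    pair_counts = Counter()
    for datasets in query_log:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(datasets)), 2):
            pair_counts[pair] += 1
    return pair_counts


def suggest_companions(pair_counts, dataset, top_n=3):
    """Return the datasets most often queried together with `dataset`."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a == dataset:
            scores[b] += count
        elif b == dataset:
            scores[a] += count
    return [name for name, _ in scores.most_common(top_n)]


# Hypothetical log: each entry lists the datasets one query touched.
query_log = [
    ["orders", "customers"],
    ["orders", "customers", "shipments"],
    ["orders", "shipments"],
    ["customers", "social_mentions"],
    ["orders", "customers"],
]

pair_counts = count_co_usage(query_log)
companions = suggest_companions(pair_counts, "orders")
print(companions)  # ['customers', 'shipments']
```

A fabric could use a signal like this to pull related datasets together when a query arrives, which is roughly the assistance the built-in ML is meant to provide.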
It’s difficult for many organizations to develop data fabric elements like machine learning “from scratch.” This is where data management software helps because it usually includes already automated, built-in machine learning that you can use in your data fabric.
Summary
Data fabrics offer an opportunity to track, monitor and utilize data while IT architectures track, monitor and maintain IT assets. Both are needed for a long-term digitalization strategy.
Data fabric development can start on a small scale, such as with a specific business area or use case. In most cases, IT can use the data integration tools it already knows, together with a data management system that can automate many of the fabric-building functions that IT is less familiar with.
The end goal should be an IT architecture that tells you where every IT asset is and what it does, and a data fabric that tells you everything you want to know about the data in that infrastructure.