Recalling my college student days, I always adored professors who thoroughly walked through what syllabus topics would appear on an upcoming test. That effort made it easier to prepare, and even sparked ideas on how a study group could tackle the course material.
Fast forward to adulthood, and I'm still searching for guidance on how to do multistep tasks, and wishing for teammates to help coordinate them.
I'm sure analytics practitioners feel a similar pain. As analytics adopts machine learning initiatives, analysts are feeling tension from growing two skill sets in their professional hip pockets -- one from advanced statistics and the other from data mining. Understanding where to apply those skills in a business initiative has become, well, a high-pressure final exam that no one wants to fail.
Enter DataOps, a general methodology to design, implement, and maintain a distributed data architecture. Its purpose is to provide teams with a process framework to manage quality improvements in data and to reduce cycle time for data analysis. DataOps blends software development techniques and agile management principles into analytics pipelines based on the data. DataOps is a framework -- a "syllabus," if you will -- for managers and analysts to know what skills can best manage the given data and what to do based on the conditions associated with the data.
The practice of DataOps is based on a variation of DevOps. In DevOps, software developers and operations specialists are brought together under agile management to quality-control software development. With DataOps, data scientists and operations specialists work together using similar tactics, except the focus is on data and associated solutions, from communication apps to analytics software.
That last sentence may elicit a "wait-a-minute" feeling -- aren't data scientists and analytics practitioners supposed to work with other professionals, quickly responding to analytic needs anyway?
The short, abridged answer is yes. But we have to look at the unabridged version to understand the practices behind DataOps' value.
Yes, people ultimately come together around data and metrics, all in the name of advocating real-time decisions. But the traditional usage of analytics held a reporting perspective -- metrics that noted past activity. Today, as machine learning is increasingly applied in operations, business intelligence must include the prevention of occurrences, especially avoiding bad decisions based on unintended yet catastrophic biases. Thus, metrics must include statistical approaches to data for predicting actions, instead of reflecting only past activity. Data is no longer just observational information, as it may have been in the context of website analytics.
That changed perspective demands more rigor throughout a data-driven organization. Statistics in models are far more exacting in their data requirements, creating a wider scope of decision making and actions people face.
A wider scope of data types also exists. Data can appear in different types and structures, spurring a dizzying array of methodologies for specialists to use to analyze it.
A DataOps initiative establishes an optimization environment buoyed by continuous analytics and a multidisciplinary team of specialists. DataOps offers a smart way to scale the work of reducing error in data input, assessing business performance through data-based models, and managing version control for visualizing results.
So, what companies are crushing it in DataOps? The topic itself is still fresh-out-of-the-wrapper new. The phrase DataOps was coined in 2015, according to Wikipedia. But a good place to look is at companies that are leveraging predictive analytics in their operations -- the Netflixes of the world. These companies are bringing a new level of disruption to their industries.
Business growth is no longer just through physical scale. Optimizing business performance through models is now central in scaling a business. Large acquisitions still abound in many industries, such as the recent CVS-Aetna announcement. But acquisitions have historically been hampered by misaligned business culture and slow technology adoption. In contrast, DataOps manages cultural transformation into a data-driven organization. Being data-driven is a holy grail in business, especially as business leaders now view data and analytics as changing the nature of industry competition. A McKinsey study highlighted this view in detail.
Ultimately, analytics has become collaborative and cross-functional, but with a specific eye for data lifecycles as much as for product or process lifecycles. That eye will be highly scrutinized in 2018, as regulatory treatment of data, such as GDPR, takes the spotlight. Higher-quality data will strengthen both a business's compliance preparation and its performance. That will put any firm embracing DataOps at the top of the class.