Big Data Forces IT & Business To Get In Sync - InformationWeek


Ron Bodkin

Big Data Forces IT & Business To Get In Sync

Data that benefits the company will slip through the cracks if IT and business groups don't bury the hatchet.

When Steve Phillpott took over as CIO at HGST (formerly Hitachi Global Storage Technologies) in February 2013, he quickly identified a number of important initiatives through which IT could create value at the company. One item at the top of the list was to radically improve how development, quality, and manufacturing operations used the vast data produced during the creation and servicing of hard drives.

It quickly became clear that achieving big data results would require close collaboration between IT and the business.

Building a roadmap to capitalize on "big data" provides an opportunity for business and technology leaders to come together, thereby avoiding a common challenge played out in many organizations: the estrangement of IT and business groups. Working together, the teams at HGST identified and prioritized opportunities to use data about HGST's hard drives to improve yield, enhance testing, and better serve customers.


To achieve the best results, the leadership team from HGST's business side approved a Big Data Platform (BDP) to serve each of these business groups, with commitment from senior leadership to support change, break down data silos, and measure business impact.

Hadoop, the foundation of HGST's BDP, is particularly well suited to breaking through data silos. Traditional relational databases store their data in well-defined table structures, and therefore require detailed data modeling before a single row of data can be loaded. Hadoop, on the other hand, simply stores its data as files on its distributed file system, greatly streamlining the data loading process. Because so much of HGST’s data comes from legacy databases, the Avro file format is used to preserve the structure of the data -- and to account for schema changes.
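To make the schema-change point concrete, here is a minimal Python sketch of the idea behind Avro-style schema resolution: data written under an older schema is read through a newer one, with missing fields filled from defaults and dropped fields ignored. It deliberately does not use the Avro library, and it is not HGST's implementation; the resolve function, the field names, and the sample records are all hypothetical.

```python
# Illustrative sketch only: mimics the idea of Avro schema resolution
# in plain Python. Field names and records are invented, not HGST data.

_REQUIRED = object()  # sentinel meaning "this field has no default"

# Hypothetical reader schema: field name -> default value.
# "firmware" was added after the oldest records were written.
READER_SCHEMA = {
    "serial": _REQUIRED,
    "test_station": _REQUIRED,
    "firmware": "unknown",
}


def resolve(record, reader_schema):
    """Project a stored record onto the reader's schema."""
    resolved = {}
    for name, default in reader_schema.items():
        if name in record:
            resolved[name] = record[name]      # field present when written
        elif default is not _REQUIRED:
            resolved[name] = default           # field added later: use default
        else:
            raise ValueError(f"missing required field: {name}")
    return resolved                            # fields unknown to the reader are dropped


# An old record (no "firmware", plus a field the reader no longer uses)
# and a new record can be read through the same schema.
old_record = {"serial": "HD-0001", "test_station": "A7", "legacy_flag": 1}
new_record = {"serial": "HD-0002", "test_station": "B2", "firmware": "1.4"}
rows = [resolve(r, READER_SCHEMA) for r in (old_record, new_record)]
```

Under these assumptions, both records come back with the same three fields, so analyses written against the current schema can still run over data loaded years earlier.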

Six months later, the joint IT and business strategy at HGST has started to transform how R&D, quality, and manufacturing teams use data in their daily work. Engineers are no longer hamstrung by systems that limit their ability to access and analyze the volumes of detailed data required to develop and refine products and to resolve issues quickly. With the BDP, data on the entire “DNA” of a hard drive -- from development to manufacturing to reliability testing -- is available and accessible at any time.

In addition, the BDP is opening the door to new avenues of yield and operational improvements by allowing engineers to run large-scale analyses on years of detailed hard drive data. For example, engineers have started to run analytics across test data for millions of hard drives to gain a finer-grained understanding of the drives’ components.

As business leaders have come to understand the potential of Hadoop through these early successes, excitement has grown, leading to a slew of new use cases proposed by business teams. Managers are able to analyze device data to understand process delays in test and manufacturing.

Data analytics also makes more relevant information available for people to act on in real time. HGST engineers can now access data about any device or component at whatever level of detail they require. Previously, they needed data search parties to hunt for data scattered across diverse databases and tape backups, and were limited to summaries of manufacturing data they had already collected.

In order to use big data to produce better business results, companies are instituting a number of organizational changes that require close collaboration between business and technology.

Identify data that’s potentially useful for the business, whether it's available internally or externally. Access to internal data often requires IT to move from limiting access for security to encouraging sharing while still governing access to data sets like web logs, customer profiles, and product usage data. Using external data such as online interests, demographics, web crawl, or social activity data requires an investment in purchasing data and a commitment to test and learn how outside data can help improve the business.

Use data science to understand the signal -- the ability to better predict behavior and identify the impact on your business of acting on it -- that's buried in the noise (complex data sets). For instance, my company, Think Big Analytics, has worked with some large hedge funds and financial payment providers to find the signal in novel data sets like news, reports, social activity, and consumer financial transactions in order to detect fraud and improve investment returns. The data scientist is truly an interpreter and facilitator between the complex world of new data and the needs of the business.

Continuously improve operational analytics across the business. The promise of big data is to have technology experts create new capabilities that the business can use to explore and help generate revenue and competitive advantage.

Use automated models that detect data patterns for both strategic and tactical responses. This means using machine learning to continuously update the best response to take automatically in response to events like user or device activity. Moving to a culture of using models instead of human intuition is a small technological step but a big organizational one.

For example, online advertising has migrated from humans defining which ads should run where to a world where participants on online exchanges use automated models to bid for the right to serve ads in real time.
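As a rough illustration of a model that keeps updating its best response as events arrive, the sketch below uses an epsilon-greedy bandit, one simple technique among many and not necessarily what any ad exchange or the author's clients use. The class name, the ad labels, and the simulated click rates are all assumptions made for the example.

```python
import random


class EpsilonGreedy:
    """Picks the action with the best observed average reward,
    exploring a random action a small fraction of the time."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {a: 0 for a in actions}    # times each action was tried
        self.values = {a: 0.0 for a in actions}  # running mean reward per action

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.counts))   # explore
        return max(self.values, key=self.values.get)    # exploit

    def update(self, action, reward):
        # Incremental mean: no history needs to be stored.
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (reward - self.values[action]) / n


# Simulated event stream; the click rates are made up, not real data.
TRUE_CLICK_RATE = {"ad_a": 0.05, "ad_b": 0.20}
model = EpsilonGreedy(list(TRUE_CLICK_RATE), epsilon=0.1, seed=42)
sim = random.Random(7)
for _ in range(5000):
    ad = model.choose()
    clicked = 1.0 if sim.random() < TRUE_CLICK_RATE[ad] else 0.0
    model.update(ad, clicked)   # the model adjusts after every event
```

The point of the sketch is organizational as much as technical: once the update loop is running, no person decides which ad to show for a given event; people decide the objective and monitor the model.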


Ron Bodkin is the founder of Think Big, a Teradata company and provider of independent consulting and integration services specifically focused on big data. The company's expertise spans all facets of data science and data engineering and helps customers to drive maximum ...

4/1/2014 | 7:19:00 AM
Beyond Traditional Big Data Focus
Good post describing the value of having information in one place, the mantra of data warehouses and data marts.

While Hadoop makes it easier to warehouse data (due to its flexible schema model), effective analytics across disparate data sources still requires defining data semantics, data mapping, and master data sources. Don't forget these important foundational building blocks.


In a recent workshop with industry IT practitioners, the focus was on the little data problems. The following problems will inhibit scaling little data to big data:
  • Uneven data management maturity across the organization
    • Emerging master data management practices
    • Minimal identification of single source of truth
    • Little agreement on core data entity representation
  • Enterprise Information sharing platform not in place
    • Fragmented data silos and data repositories
    • Ad hoc, project-level data integration
    • Limited data virtualization and data services
    • Proliferation of unknown Excel spreadsheets


In addition to copying legacy data, some BDP implementation roadmaps tie directly into business activity message streams and don't wait for bulk copies.


