Commentary
3/31/2014
09:06 AM
Ron Bodkin

Big Data Forces IT & Business To Get In Sync

Data that benefits the company will slip through the cracks if IT and business groups don't bury the hatchet.

When Steve Phillpott took over as CIO at HGST (formerly Hitachi Global Storage Technologies) in February 2013, he quickly identified a number of important initiatives through which IT could create value at the company. One item at the top of the list was to radically improve how development, quality, and manufacturing operations used the vast data produced during the creation and servicing of hard drives.

It quickly became clear during this process that achieving big data results would require close collaboration between IT and the business.

Building a roadmap to capitalize on "big data" provides an opportunity for business and technology leaders to come together, thereby avoiding a common challenge played out in many organizations: the estrangement of IT and business groups. Working together, the teams at HGST identified and prioritized opportunities to use data about HGST's hard drives to improve yield, enhance testing, and better serve customers.


To achieve the best results, HGST's business leadership approved a Big Data Platform (BDP) to serve each of these business groups, backed by a high-level commitment to support change, break down data silos, and measure business impact.

Hadoop, the foundation of HGST's BDP, is particularly well suited to breaking through data silos. Traditional relational databases store their data in well-defined table structures, and therefore require detailed data modeling before a single row of data can be loaded. Hadoop, on the other hand, simply stores its data as files on its distributed file system, greatly streamlining the data loading process. Because so much of HGST’s data comes from legacy databases, the data is kept in the Avro file format, which preserves the structure of the data -- and accounts for schema changes.
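To make the point about schema changes concrete, here is a minimal sketch of the kind of field resolution Avro performs when data written under an older schema is read with a newer one: fields the reader no longer declares are dropped, and fields the writer never had take the reader's default. This is plain Python that mimics that rule, not the Avro library itself, and the record and field names (DriveTest, serial, head_count, legacy_code, firmware) are hypothetical, not HGST's actual schema.

```python
import json

# Hypothetical schemas for a drive test record; all names are illustrative only.
# The writer schema is what a legacy export might have used.
WRITER_SCHEMA = json.loads("""
{"type": "record", "name": "DriveTest",
 "fields": [
   {"name": "serial", "type": "string"},
   {"name": "head_count", "type": "int"},
   {"name": "legacy_code", "type": "string"}
 ]}
""")

# The reader schema drops legacy_code and adds firmware with a default.
READER_SCHEMA = json.loads("""
{"type": "record", "name": "DriveTest",
 "fields": [
   {"name": "serial", "type": "string"},
   {"name": "head_count", "type": "int"},
   {"name": "firmware", "type": "string", "default": "unknown"}
 ]}
""")


def resolve_record(record, writer_schema, reader_schema):
    """Project a record written with writer_schema onto reader_schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            # Field exists in both schemas: carry the value over.
            resolved[name] = record[name]
        elif "default" in field:
            # Field is new in the reader schema: fall back to its default.
            resolved[name] = field["default"]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    # Writer-only fields (here, legacy_code) are simply not copied.
    return resolved


old_record = {"serial": "ABC123", "head_count": 8, "legacy_code": "X9"}
new_view = resolve_record(old_record, WRITER_SCHEMA, READER_SCHEMA)
print(new_view)
```

In a real deployment an Avro library would handle this resolution, along with type promotion and binary encoding; the sketch covers only the add-a-field and drop-a-field cases.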

Six months later, the joint IT and business strategy at HGST has started to transform how R&D, quality, and manufacturing teams use data in their daily work. Engineers are no longer hamstrung by systems that limit the ability to access and analyze the volumes of detailed data required to develop and refine products and to resolve issues quickly. With the BDP, data on the entire “DNA” of a hard drive -- from development to manufacturing to reliability testing -- is available and accessible at any time.

In addition, the BDP is opening the door to new avenues of yield and operational improvements by allowing engineers to run large-scale analyses on years of detailed hard drive data. For example, engineers have started to run analytics across test data for millions of hard drives to gain a finer-grained understanding of drive components.

As business leaders have come to understand the potential of Hadoop through these early successes, excitement has grown, leading to a slew of new use cases proposed by business teams. For example, managers can now analyze device data to understand process delays in test and manufacturing.

Data analytics also makes more relevant information available for people to act on in real time. HGST engineers can now access data about any device or component at whatever level of detail they require. Previously, they had to send out data search parties to hunt for data scattered across diverse databases and tape backups, and were limited to summaries of the manufacturing data they had already collected.

To use big data to produce better business results, companies are instituting a number of organizational changes that require close collaboration between business and technology:

Identify data that’s potentially useful for the business, whether it's available internally or externally. Access to internal data often requires IT to move from limiting access for security to encouraging sharing while still governing access to data sets like web logs, customer profiles, and product usage data. Using external data such as online interests, demographics, web crawl, or social activity data requires an investment in purchasing data and a commitment to test and learn how outside data can help improve the business.

Use data science to understand the signal -- the ability to better predict behavior and identify the impact on your business of acting on it -- that's buried in the noise (complex data sets). For instance, my company, Think Big Analytics, has worked with some large hedge funds and financial payment providers to find the signal in novel data sets like news, reports, social activity, and consumer financial transactions in order to detect fraud and improve investment returns. The data scientist is truly an interpreter and facilitator between the complex world of new data and the needs of the business.

Continuously improve operational analytics across the business. The promise of big data is to have technology experts create new capabilities that the business can use to explore and help generate revenue and competitive advantage.

Use automated models that detect data patterns for both strategic and tactical responses. This means using machine learning to continuously update the best automatic response to events like user or device activity. Moving to a culture of using models instead of human intuition is a small technological step but a big organizational one.

For example, online advertising has migrated from humans defining which ads should run where to a world where participants on online exchanges use automated models to bid in real time for the right to serve ads.
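As a rough illustration of a model that keeps updating its best response as events arrive, the sketch below tracks a running average reward for each candidate action and always picks the current leader. It is a deliberately simplified stand-in: the class name, the ad identifiers, and the click events are all hypothetical, and real bidding systems use far richer models and add exploration so that weaker-looking options still get tried.

```python
from collections import defaultdict


class OnlineResponseModel:
    """Keeps a running average reward per action and picks the best one."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.counts = defaultdict(int)    # observations seen per action
        self.values = defaultdict(float)  # running mean reward per action

    def update(self, action, reward):
        # Incremental mean: new = old + (reward - old) / n
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (reward - self.values[action]) / n

    def best_action(self):
        # Greedy choice; ties go to the earliest action in the list.
        return max(self.actions, key=lambda a: self.values[a])


# Hypothetical event stream: (ad shown, 1 if clicked else 0).
model = OnlineResponseModel(["ad_a", "ad_b"])
events = [("ad_a", 0), ("ad_b", 1), ("ad_a", 1), ("ad_b", 1), ("ad_a", 0)]
for action, clicked in events:
    model.update(action, clicked)

print(model.best_action())
```

Each event adjusts the estimate immediately, so the recommended action can change without anyone re-deciding by hand, which is the organizational shift described above.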


Ron Bodkin founded Think Big, a leading provider of independent consulting and integration services specifically focused on big data. The company's expertise spans all facets of data science and data engineering and helps customers to drive maximum value from their data ...

Comments
cobiacomm01
4/1/2014 | 7:19:00 AM
Beyond Traditional Big Data Focus
Good post describing the value of having information in one place; the mantra of data warehouses and data marts.

While Hadoop makes it easier to warehouse data (due to its flexible schema model), effective analytics across disparate data sources still requires defining data semantics, data mapping, and master data sources. Don't forget these important foundational building blocks.

In a recent workshop with industry IT practitioners, the focus was on little data problems. The following problems will inhibit scaling little data to big data:
  • Uneven data management maturity across the organization
    • Emerging master data management practices
    • Minimal identification of single source of truth
    • Little agreement on core data entity representation
  • Enterprise information sharing platform not in place
    • Fragmented data silos and data repositories
    • Ad hoc, project-level data integration
    • Limited data virtualization and data services
    • Proliferation of unknown Excel spreadsheets

 

In addition to copying legacy data, some BDP implementation roadmaps tie directly into business activity message streams and don't wait for bulk copies.

 

 

 