Data, Data, Everywhere - InformationWeek

Data, Data, Everywhere

Multiterabyte databases are getting downright common. But with more real-time data, complex queries, and increasing numbers of sources, managing them is anything but routine.

There's no Moore's law to sum up the growth curve of databases. But here's a rule of thumb: The amount of data stored by businesses nearly doubles every 12 to 18 months. And the very biggest--those at or near the 100-terabyte mark--probably triple every three years.
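To put the rule of thumb on a common footing, the growth figures can be converted to equivalent annual rates. The following is a minimal sketch that assumes smooth compound growth and takes the quoted figures at face value; the function names are illustrative and not from the article.

```python
def annual_growth_factor(multiple: float, period_months: float) -> float:
    """Yearly growth factor if data grows by `multiple` every `period_months` (compound growth assumed)."""
    return multiple ** (12.0 / period_months)


def size_after(start_tb: float, multiple: float, period_months: float, months: float) -> float:
    """Projected size in terabytes after `months`, under the same compound-growth assumption."""
    return start_tb * multiple ** (months / period_months)


if __name__ == "__main__":
    scenarios = [
        ("doubling every 12 months", 2, 12),
        ("doubling every 18 months", 2, 18),
        ("tripling every 36 months", 3, 36),
    ]
    for label, multiple, period in scenarios:
        factor = annual_growth_factor(multiple, period)
        print(f"{label}: about {100 * (factor - 1):.0f}% per year")
```

Under these assumptions, doubling every 12 to 18 months works out to roughly 59% to 100% growth a year, while tripling every three years is about 44% a year. Read literally, the quoted figures imply that the largest databases grow somewhat more slowly in percentage terms than the business average, though far faster in absolute terabytes.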

But databases aren't just getting bigger. They're also becoming more real time. Wal-Mart Stores Inc. refreshes sales data hourly, adding a billion rows of data a day, allowing more complex searches. EBay Inc. lets insiders search auction data over short time periods to get deeper insight into what affects customer behavior. Data also is coming from increasingly complex sources: Radio-frequency identification readers now feed data to Wal-Mart, and Nielsen Media Research, in collecting info on TV-viewing habits, is getting data from TiVos along with the standard living-room set.

Businesses don't run the biggest databases in the world. That honor is reserved for the Stanford Linear Accelerator Center, NASA's Ames Research Center, and other government groups such as the National Security Agency, which run databases in the petabyte (1,000-terabyte) range. But because businesses run fast-response systems that need to quickly get data in and answers out, they're solving some of the most interesting problems in data management.

Businesses are dealing with the complexities of engineering databases that combine historical and real-time data from multiple sources. Designing and building the hundreds, even thousands, of tables that make up multiterabyte databases and the queries used to extract useful knowledge can test the technical and management skills of any database administrator. But the advantages of big databases are obvious: Most of the largest are data warehouses for analytical tasks where more, and more-detailed, data means better insights. With real-time or near-real-time data, the value of those insights increases exponentially. "We know how many 2.4-ounce tubes of toothpaste sold yesterday, and what was sold with them," says Dan Phillips, Wal-Mart's VP of information systems.

Business As Usual At Wal-Mart
No company better illustrates the payoff of using massive volumes of data for competitive advantage than Wal-Mart, which operates a data warehouse with, at last count, 583 terabytes of sales and inventory data. The warehouse is built on a massively parallel 1,000-processor system from data-warehouse-technology vendor Teradata, an NCR Corp. subsidiary. While some companies might consider more than half a petabyte of data overkill, at Wal-Mart it's the way to do business.

"Our database grows because we capture data on every item, for every customer, for every store, every day," Phillips says. Wal-Mart deletes data after two years and doesn't track individual customer purchases, he says.

By refreshing the information its data warehouse holds every hour--1 billion rows of data or more are updated every day--Wal-Mart turned its data warehouse into an operational system for managing daily store operations. Store managers used to query the database at the end of the day to see what was selling at their location. Now they can check hourly and see what's happening at stores throughout a region that might be experiencing an unusual event such as a snowstorm or hurricane.
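For a sense of the load those figures imply, here is a back-of-the-envelope calculation. It assumes the billion daily rows are spread evenly over 24 hourly refreshes, which real retail traffic almost certainly is not; the constants and function names are illustrative.

```python
ROWS_PER_DAY = 1_000_000_000   # "1 billion rows of data or more" per day, from the article
REFRESHES_PER_DAY = 24         # hourly refresh, from the article
SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_refresh(rows_per_day: int, refreshes_per_day: int) -> float:
    """Average rows loaded in each refresh, assuming an even spread (an assumption)."""
    return rows_per_day / refreshes_per_day


def rows_per_second(rows_per_day: int) -> float:
    """Average sustained load rate over a full day."""
    return rows_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"rows per hourly refresh: {rows_per_refresh(ROWS_PER_DAY, REFRESHES_PER_DAY):,.0f}")
    print(f"average rows per second: {rows_per_second(ROWS_PER_DAY):,.0f}")
```

On that even-spread assumption, each hourly refresh carries about 41.7 million rows, or roughly 11,600 rows a second sustained around the clock. Peak hours would run well above that average.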

Phillips tells the story of how IT staff at Wal-Mart's Bentonville, Ark., headquarters tapped into the data warehouse the morning after Thanksgiving three years ago and noticed that East Coast sales of a computer-and-monitor holiday special were far below expectations. Marketing staff contacted stores and learned the computers and monitors weren't being displayed together, so potential buyers couldn't see what they were getting for the posted price. Calls went out to Wal-Mart stores across the country to rearrange the displays. "By 9:30 a.m. Central, the pace of sales could be seen picking up in our data," Phillips recalls.

Blurring The Lines
Data's usefulness is rarely so clear-cut. And Wal-Mart's capabilities are beyond the scope of most businesses. But its reliance on data for day-to-day business decisions is being emulated elsewhere, particularly in retail, telecommunications, financial services, and manufacturing.

The dividing line between operational and historical data isn't as firmly drawn as just a few years ago, says Bill O'Connell, chief technology officer of IBM's data-warehouse and business-intelligence business. "You're seeing a blurring of the lines between operational and strategic systems," he says. But that means the two must be carefully engineered to work together, which complicates the life of the database administrator even more.

EBay learned a big-database lesson or two as it rapidly grew into the world's largest online auction house. "We started in 1999 and 2000 with one monolithic Oracle database," says David Pride, VP of information management and delivery. "Since then, we've done a series of splits that let us scale out horizontally" into several hundred databases totaling 100 terabytes of data.
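The article doesn't say how eBay decides which data goes to which database. As a generic illustration of scaling out horizontally, the sketch below routes each record to one of several stores by hashing its key. The class, function, and key names are hypothetical, and in-memory dictionaries stand in for real databases.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash (same key always lands on the same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy stand-in for a set of separate databases; each shard is a plain dict."""

    def __init__(self, shard_count: int) -> None:
        self.shards = [dict() for _ in range(shard_count)]

    def put(self, key: str, value: object) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> object:
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(shard_count=4)
    for n in range(10):
        store.put(f"item-{n}", {"bid": n * 10})
    print(store.get("item-3"))
    print([len(shard) for shard in store.shards])
```

Simple modulo hashing like this forces most keys to move whenever the shard count changes. A site that grows through "a series of splits" may instead partition by function or by key range, so treat this as one possible approach rather than a description of eBay's design.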
