Overstock.com Moves Reports To Data Warehouse

Faced with rapid growth, Overstock.com is building a data warehouse for operational reporting to reduce the strain on its transactional systems.

Charles Babcock, Editor at Large, Cloud

August 3, 2005

Overstock.com Inc., the "close-out" retailer of brand-name merchandise, is migrating from an Oracle 9i database to Oracle 10g and shifting its reporting from operational systems to a data warehouse.

The latter is necessary because Overstock's business has been growing by 100% a year and its operational systems were staggering under their combined customer transaction and reporting loads, says Shawn Schwegman, senior VP of technology. With 14 million to 18 million hits a month, Overstock's Web site handles an information and transaction load equivalent to Sears' or J.C. Penney's Web sites, he says.

The demands on its transactional systems were becoming so heavy that Overstock found itself "turning off one reporting function after another" to keep customer response times satisfactory. But Schwegman acknowledges that was a dead-end path. "Basically, you're dead in the water if you can't report on operations," he says.

What's more, each time Schwegman turned off a reporting function, "I had the most irate users you've ever seen," he says.

With no overnight window in which to shut down operations and move data, the 24-by-7 online merchandiser had to decide how best to make its migration to Oracle 10g and implement a new reporting system without affecting operations.

That meant Overstock needed a high-volume data-management tool to migrate data into Oracle 10g and then load selected data into the new NCR Corp. Teradata data warehouse, Schwegman says. Overstock evaluated data-migration toolsets from GoldenGate Software, Quest Software, and DataMirror. All three offer solid, real-time data migration, Schwegman says, but Overstock selected GoldenGate's transactional data-management tools for their ability to read Oracle database log files and extract the needed data from those files rather than querying the database itself.

"Most ETL [extract, transform, and load] tools add 10% to 15% performance overhead to the production environment. Because GoldenGate can read the Oracle log files, its overhead is less than 4%," Schwegman says.
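The log-reading approach Schwegman describes can be illustrated with a toy sketch. It is not GoldenGate's API, and it does not reflect the real Oracle redo log format; the `LogRecord` type, the `apply_changes` function, and the table names are all hypothetical. It only shows the general idea: a reader replays change records from an append-only log into a reporting copy, keeping only the tables it wants, and never queries the production tables.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class LogRecord:
    """One change record in a hypothetical, simplified transaction log."""
    seq: int                 # position in the log, strictly increasing
    op: str                  # "insert", "update", or "delete"
    table: str
    key: int
    row: Optional[dict] = None


def apply_changes(
    log: Iterable[LogRecord],
    replica: Dict[str, Dict[int, dict]],
    wanted_tables: Set[str],
    last_seq: int = 0,
) -> int:
    """Replay log records newer than last_seq into the replica.

    Only tables in wanted_tables are copied. Returns the sequence number
    of the last record seen, so the next run can resume from there.
    """
    for rec in log:
        if rec.seq <= last_seq:
            continue  # already applied on an earlier pass
        if rec.table in wanted_tables:
            rows = replica.setdefault(rec.table, {})
            if rec.op == "delete":
                rows.pop(rec.key, None)
            else:
                # insert and update both store the latest row image
                rows[rec.key] = dict(rec.row or {})
        last_seq = rec.seq
    return last_seq


# Hypothetical log contents, for illustration only.
log = [
    LogRecord(1, "insert", "orders", 100, {"total": 25}),
    LogRecord(2, "insert", "sessions", 7, {"ip": "10.0.0.1"}),
    LogRecord(3, "update", "orders", 100, {"total": 30}),
    LogRecord(4, "insert", "orders", 101, {"total": 12}),
    LogRecord(5, "delete", "orders", 101),
]

warehouse: Dict[str, Dict[int, dict]] = {}
checkpoint = apply_changes(log, warehouse, wanted_tables={"orders"})
```

Because the reader works from the log and remembers a checkpoint, it can run continuously alongside the transactional system, which is the property that matters for a site with no overnight window.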

Overstock uses 10 Oracle database clusters, the largest consisting of four eight-way servers. That cluster alone represents a $1 million technology investment, including the cost of the application software running on it. Reducing performance overhead by 10% saves Overstock $100,000, Schwegman says. His total relational database investment is more than $5 million, and saving 10% of that would be worth considerably more.
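Schwegman's arithmetic can be checked with a short sketch. The function name `overhead_savings` is hypothetical, and the model is the simple one implied by his figures: capacity freed by lower overhead is valued as the same share of the hardware and software investment. The $5 million figure is a floor, since the article gives the total only as more than $5 million.

```python
def overhead_savings(investment_usd: int, overhead_reduction_pct: int) -> int:
    """Value of freed capacity, as a whole-percent share of the investment."""
    return investment_usd * overhead_reduction_pct // 100


# Largest cluster: $1 million investment, 10% less overhead.
largest_cluster = overhead_savings(1_000_000, 10)

# All relational databases: at least $5 million, so this is a lower bound.
all_databases_floor = overhead_savings(5_000_000, 10)

print(largest_cluster, all_databases_floor)
```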

Overstock purchased the GoldenGate data-management tools two months ago. In the first phase of its project, it has been migrating from Oracle 9i to 10g, with GoldenGate handling the data transfers without any database downtime.

The online retailer also has been building a Teradata data warehouse, with Business Objects SA reporting software replacing the 30,000 lines of custom reporting code previously used by Overstock managers and analysts. The company started to assemble the data warehouse in April. Such a project normally takes six months or more, but Schwegman expects to finish the work in less time. The system now has 25 early users, "executives and a set of data hounds, power users," he says. In another month, the total will be 250 users.

When the data warehouse is finished, GoldenGate will continue moving data from the operational Oracle systems into the data warehouse "in near real time to provide reporting and analysis for our environment," Schwegman says.

Needing real-time analysis for customer-relationship management was one reason Overstock created its own in-house reporting system after it was founded in 1999. Now, as a maturing online retailer--Schwegman says Overstock is the No. 6 or 7 retail Web site in terms of the number of its unique visitors--Overstock must move away from homegrown systems to systems that march in step with its growth rather than impede it.

About the Author(s)

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
