Overstock Accelerates with Big Data Platform - InformationWeek

Overstock is one of the granddaddies of Internet commerce, with 20 years' worth of customer data to support its marketing analytics. Here's how it increased the speed of models and analysis by 5x.

Getting the attention of consumers is no small feat these days. They are bombarded with advertisements on television, through their mobile phones, and on the Internet. You may see ads from your favorite online retailers follow you from Google to news sites to Facebook.

Engaging with the right consumer at the right time can make a big difference for online retailers, and these ads are an important means of doing it. But there's plenty of competition out there.

(Image: TheDigitalArtist/Pixabay)

If you want to use these kinds of ads to gain consumer attention today, you probably have to act fast. It's not like the Internet commerce of 20 years ago when online retailer Overstock came onto the scene, buying and then selling the inventory of failed online retailers.

Overstock.com is one of the original ecommerce businesses. This online retailer was founded the same year as Netflix (the company that started by sending out DVDs via US mail) and three years after Amazon.com made its debut selling print books on the Internet and shipping them to your doorstep. It was a time when the World Wide Web -- we called it that back then -- was just getting started as a mainstream network used by consumers for many things, including purchases. After 20 years in business, Overstock has amassed huge volumes of data.

Overstock's business model has evolved over the years beyond discount and liquidation to include sales of new merchandise and hand-crafted merchandise from developing countries. The site sells everything from furniture to apparel to electronics.

Overstock has always been a bit of a trailblazer. For instance, back in 2014 it was among the first big retailers to accept bitcoin for payment. So it shouldn't be a surprise that Overstock would work with newer technologies to get an advantage when it comes to advertising and marketing itself to consumers.

Like many other businesses, Overstock uses SEM, also known as search engine marketing, or paid search, to place advertisements on the familiar sites that consumers use -- from Google Ads to Facebook. If you search for "sectional couch," for instance, an ad for that type of furniture at Overstock.com may very well appear at the top of your search results on Google. And then later, Facebook will show you an ad for sectional couches at Overstock.

Chris Robison, Overstock's lead data scientist for marketing, has overseen the pieces of technology that contribute to the company's effort to bid for ads across various advertising platforms. Among the technologies in place to perform the work were Teradata, Python, Jupyter Notebooks, Apache Spark, Scala, and a large Hadoop cluster. There were many of today's leading-edge technologies in place, Robison told InformationWeek in an interview. But those technologies were siloed.

The problem was gigantic. How do you know when a customer is most likely to purchase? Robison's team wanted to assign scores to customers based on their likelihood of purchasing, and was working to better understand customer browsing and purchasing behavior.
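To make the idea of a purchase-likelihood score concrete, here is a minimal Python sketch. It is not Overstock's model: the feature names, weights, and bias are hypothetical values chosen for illustration, and a real system would learn them from historical data. The sketch simply passes a weighted sum of behavior features through a logistic function to get a score between 0 and 1.

```python
import math

# Hypothetical weights for illustration only; a real model would learn these.
WEIGHTS = {
    "sessions_last_7d": 0.4,
    "cart_adds_last_7d": 0.9,
    "days_since_last_purchase": -0.02,
}
BIAS = -2.0  # hypothetical intercept


def purchase_score(features):
    """Return a score in (0, 1); higher means more likely to purchase."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


# Two made-up customers: one casual visitor, one actively shopping.
casual = {"sessions_last_7d": 1}
engaged = {
    "sessions_last_7d": 5,
    "cart_adds_last_7d": 2,
    "days_since_last_purchase": 10,
}

if __name__ == "__main__":
    for label, customer in (("casual", casual), ("engaged", engaged)):
        print(label, round(purchase_score(customer), 3))
```

Scores like these could then be used to rank customers, for example when deciding how much an ad impression is worth bidding on.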

Robison's small team of data scientists -- himself and three others -- had to oversee not just the data science but also the data engineering -- making sure all the technology pieces worked together, which was a time-consuming process.

"We want all the data, all the time, and all in near real time in order to make smart business decisions. Instead of focusing on our critical data problems and models, our data scientists found themselves dealing with the complexities of managing infrastructure," Robison said in a statement.

Communications in this scenario took time, too. For instance, a team member who wanted to get a data set out of one of the data warehouses would need to go to the team, fill out a ticket, and move the data to the environment where it would be used.

"We wanted to speed up the iteration cycle," Robison said in an interview with InformationWeek. "Pushing out new features weeks before we would have been able to -- that can add a significant impact to the bottom line."

They knew they needed to bring the technology pieces together for a unified view and to speed up the processes. With the need for speed in mind, Overstock opted to do it with Databricks Unified Analytics Platform. Databricks is a company founded by the creators of Apache Spark, an open source analytics engine for big data. Databricks' first service as a company was a hosted version of Spark. Databricks Unified Analytics Platform provides a hosted, unified platform that includes the technology Overstock needed for its SEM bidding work.

Robison will be recounting more of the story of Overstock's move during a keynote address on June 6 at Spark + AI Summit in San Francisco.

Robison's team used Spark to filter out bot traffic on the site, and then determine patterns of purchase behavior. For instance, customers were more likely to browse during the day while at work, but then make their actual purchases in the evening when they were back at home.
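The two steps described here, dropping bot traffic and then looking at when people browse versus buy, can be sketched in a few lines. This is a plain Python stand-in rather than a Spark job, and everything in it is assumed for illustration: the event records are made up, the user-agent check is a deliberately crude heuristic, and the daypart boundaries are arbitrary.

```python
from collections import Counter
from datetime import datetime

# Made-up clickstream events: (session_id, user_agent, event_type, timestamp).
EVENTS = [
    ("s1", "Mozilla/5.0", "browse", "2018-05-01T10:15:00"),
    ("s1", "Mozilla/5.0", "browse", "2018-05-01T11:40:00"),
    ("s1", "Mozilla/5.0", "purchase", "2018-05-01T20:05:00"),
    ("s2", "ExampleBot/1.0 (crawler)", "browse", "2018-05-01T03:00:00"),
    ("s3", "Mozilla/5.0", "browse", "2018-05-01T14:30:00"),
    ("s3", "Mozilla/5.0", "purchase", "2018-05-01T21:10:00"),
]

# Assumed heuristic: treat any user agent containing these words as a bot.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_bot(user_agent):
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def daypart(timestamp):
    """Bucket an ISO timestamp into 'day', 'evening', or 'night'."""
    hour = datetime.fromisoformat(timestamp).hour
    if 9 <= hour < 17:
        return "day"
    if hour >= 17:
        return "evening"
    return "night"


def daypart_counts(events):
    """Count (event_type, daypart) pairs after removing bot traffic."""
    counts = Counter()
    for _session, user_agent, event_type, timestamp in events:
        if is_bot(user_agent):
            continue
        counts[(event_type, daypart(timestamp))] += 1
    return counts


if __name__ == "__main__":
    for key, n in sorted(daypart_counts(EVENTS).items()):
        print(key, n)
```

On a real clickstream the same filter-then-aggregate shape would run as a distributed job, and bot detection would need far more than a user-agent check.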

The deployment of a unified platform has decreased the cost of moving models to production by nearly 50% and has made standing up new models 5x faster, according to Robison. The team is able to spin clusters up and down through self-service cluster management, which has also accelerated the process.

Overstock is now applying the platform to stem fraud in its ecommerce operation, Robison said -- individuals making purchases using stolen credit cards and identities.

"The unified platform allows each project to learn from the projects that came before," he said.

Jessica Davis has spent a career covering the intersection of business and technology at titles including IDG's InfoWorld, Ziff Davis Enterprise's eWeek and Channel Insider, and Penton Technology's MSPmentor. She's passionate about the practical use of business intelligence, ...
