9/24/2012
03:33 PM

Online Retailer Uses DNA Research To Connect With Customers

Home furnishings startup Wayfair applies principles of protein analysis to recommend products.

When you think of big data and its impact on ecommerce, words such as Hadoop, NoSQL, and predictive modeling might spring to mind. DNA research? Not so much. But Wayfair, an online retailer of home furnishings, is applying research from the scientific discipline of protein analysis to a more pragmatic problem: How to recommend relevant products to shoppers on its site.

In late 2011, the company was searching for a better customer recommendation system. "We found that the family of techniques most well known in that area just didn't work with our data," said Ben Clark, Wayfair's director of search and recommendations, in a phone interview with InformationWeek.

Clark and his team of data scientists scoured academic and industry research papers for innovative approaches, or new ways to look for patterns in data. "One of my guys had a particularly inspired flight of intuition in connecting things that don't--on their face--look well connected at all," said Clark.

That insight came in the form of a 1997 research paper by Dutch bioinformatician Stijn van Dongen. Bioinformatics is a branch of biological science that explores ways to store, analyze, and retrieve biological data. Clark's team began using the clustering techniques that van Dongen had used to analyze proteins, along with a software toolkit that the Dutch researcher had written and made available under a free license.

"Sure enough, when we ran our data through it, we could tell immediately that the results looked intuitively good. Then we put it up on our site, and people seem to like it," Clark said. A February 2012 blog post by Clark summarizes how Wayfair used van Dongen's techniques to build its recommendation engine. The post includes four images, each showing lines and dots that represent clusters of proteins and their connections with one another.

"I don't know what that represents in the protein world, but in my world, it represents a connection between two items," said Clark.

On an ecommerce site, a connection could be defined in several ways: two people who use the same item, for instance, or one person who bought two items in the same shopping cart. The thicker the line, the stronger the connection between two items.
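
To make the idea of weighted connections concrete, here is a minimal Python sketch that counts how often two items appear in the same shopping cart and uses that count as the connection strength. The carts, the item names, and the `co_purchase_weights` helper are all invented for illustration; nothing here comes from Wayfair's actual code or data.

```python
from collections import Counter
from itertools import combinations


def co_purchase_weights(carts):
    """Count how many carts contain each unordered pair of items."""
    weights = Counter()
    for cart in carts:
        # Deduplicate and sort so every pair has a single canonical key.
        for a, b in combinations(sorted(set(cart)), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical carts, invented for this example.
carts = [
    ["sofa", "rug", "lamp"],
    ["sofa", "rug"],
    ["rug", "lamp"],
    ["sofa", "grill"],
]
weights = co_purchase_weights(carts)

for pair, count in sorted(weights.items()):
    print(pair, count)
```

A higher count plays the role of a thicker line in the pictures Clark describes. Other definitions of a connection, such as shared viewers, would only change how the pairs are generated.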

Wayfair needed a way to weed out the less relevant connections. Customers "are surfing around our site, and we're trying to make useful lists of things they might want to buy," Clark said. "If we just say that everything is connected, that gives us too much data."

The Dutch researcher's mathematical process allowed Wayfair to remove the "wispy, tenuous connections that aren't as strong," and uncover clusters of things with strong enough connections to be useful to its customers, said Clark.
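
The pruning step can be illustrated with a much simpler stand-in than van Dongen's method: drop every connection below a fixed strength, then group the items that remain linked. The Python sketch below does exactly that with a plain threshold and a graph walk. It is not the clustering algorithm Wayfair used, and the edge weights, item names, `min_weight` parameter, and `strong_clusters` helper are hypothetical.

```python
from collections import defaultdict


def strong_clusters(weights, min_weight):
    """Drop edges weaker than min_weight, then group items still linked."""
    neighbors = defaultdict(set)
    for (a, b), w in weights.items():
        if w >= min_weight:
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters, seen = [], set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first walk over the edges that survived pruning.
        stack, group = [start], set()
        while stack:
            item = stack.pop()
            if item in group:
                continue
            group.add(item)
            stack.extend(neighbors[item] - group)
        seen |= group
        clusters.append(sorted(group))
    return clusters


# Hypothetical connection strengths, invented for this example.
weights = {
    ("rug", "sofa"): 5,
    ("lamp", "rug"): 4,
    ("lamp", "sofa"): 3,
    ("grill", "sofa"): 1,  # a "wispy" link between two otherwise separate groups
    ("grill", "patio chair"): 6,
}
clusters = strong_clusters(weights, min_weight=3)
print(clusters)
```

With the weak grill-to-sofa link removed, the living room items and the patio items fall into separate groups; keeping every link (`min_weight=1`) merges them into one list, which is the "everything is connected" problem Clark describes. A fixed cutoff is a crude choice, and a real system would likely need something more adaptive.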

It's difficult to estimate the economic impact of the new technique, Clark said. However, a similar approach that Wayfair used for another recommendation system has increased customer click-through rate by 18%. "From where I sit in this business, that's a huge increase," said Clark.

It's unclear whether van Dongen's clustering techniques and software toolkit would work as well for other ecommerce sites. Clark points to a quote in his blog post from Data Analysis with Open Source Tools, a book by software project consultant Philipp Janert, who states that only spam filtering, credit card fraud detection, and credit scoring applications have been effective across a wide range of usage scenarios.

As for customer recommendation engines: "The approaches that work tend to be quite ad hoc. I think it's still a very difficult problem to solve these things in a general way," said Clark.
