Olly Downs has spent his career parsing data. He started in 2004 at INRIX, the first spinoff from Microsoft Research, which used crowdsourced location data to do things like predict traffic flows. Downs has been involved in a series of startups since then, including Atigeo, Mindset Media and Pelago.
Now he's at Globys, a 16-year-old firm focused on applying analytics to corporate marketing. Downs looks at ways for mobile phone operators to walk the fine line between effective and creepy personalized marketing. "It's rocks and gems," he says. Doing it right can be very positive for companies, but getting it wrong damages their reputations.
Downs also understands the challenges private companies face in hiring data scientists. He spoke with InformationWeek via phone in September about his own work at Globys.
Name: Olly Downs
Title: Senior VP of data sciences, Globys
How long at current job: Since June 2012
Career accomplishment I'm most proud of: The best work you do ends up being retrospectively simple. When we were starting INRIX, crowdsourcing location information from vehicles, there was an accepted rule that governed how many vehicles you needed to get an accurate measurement of traffic flow. Some of my work was to show that you needed more than an order of magnitude fewer vehicles on the road to come up with the same information.
The second is that in part of my Ph.D. thesis, I explored the notion of computing in a way that is quantum mechanical but not a quantum computer. It is in part the foundation on which D-Wave Systems has been able to build the kind of quantum computer my thesis envisaged. So I'm hoping to be involved in a fundamental change in computing.
Decision I wish I could do over: A couple of failures I wish I had been able to avoid. One of them was a business where we focused on developing a novel technology and put all our effort into the tech, but didn't focus on the pragmatic use cases customers would have for it. I'm not going to name it; it was in the natural language processing space.
Most important career influencer: In recent years it has been Jack Breese, one of the original machine learning guys [who] formed the heart of data science at Microsoft Research. He has continued to be a great reviewer and critic for me of work I've been doing, and I've really valued getting his unvarnished feedback. Others wouldn't have given me such brutally honest comments. The other great mentor of mine has been David Heckerman, also at Microsoft Research.
Top initiatives: Our top initiatives are related to driving value with prepaid subscribers -- identifying and answering, through experiment and algorithm, this property of social contagion we call homophily, and also applying a new set of model approaches to representation of customer data.
It's conventional today in BI and analytics to view a customer as a row of attributes that update over some period of time. Our representation is a longitudinal view of every event, transaction, purchase, etc. they've made, and discovering patterns in them. It opened up for us an area of machine learning and data mining that has been underemployed in the last 10 to 15 years: dynamic Bayesian networks and state space modeling.
Biggest misconception about big data: That big data is about storage technology. There's a lot of emphasis on the infrastructure, which is important, but not a lot of emphasis yet on good outcomes. There are a lot of people with great large-scale Hadoop deployments wondering what they're going to do with them.
Most disruptive force in my industry: Apps that do voice and messaging over Wi-Fi. It's a major disruptor.
Most promising technology: I'm very much enamored with the efforts that have gone on in developing Python, Interactive Python and the IPython Notebook. Python as a tool allows us to bridge between research and algorithm and model prototyping and visualization of the type people do in R, MATLAB, SPSS and so on. It's kind of a game changer.
Reasons big data projects go wrong: Essentially that the stakeholders are wrong. They believe that big data is the IT procurement guys deploying a system. You need an end-to-end use case that has ROI. That can mean having end users [who] don't work well together collaborate closely.