Building a data science team? As a first step, swap in subject-matter experts from the business to help you formulate the right questions. Next, consider hiring physicists and music majors as well as statisticians and computer scientists.
These are the key, and in some ways unorthodox, data-science hiring pointers offered by Josh Sullivan, VP of the Strategic Innovation Group at management consulting and technology advisory firm Booz Allen Hamilton. Sullivan's group is responsible for helping clients with data analytics initiatives, and in that work Sullivan said he sees many companies make the same mistake.
"Most companies only think to hire computer scientists because they think big data is a technology problem; but it's not," Sullivan told InformationWeek in an interview. "The first thing I ask people is, 'what question are you going to ask of your data,' not 'how are you going to code it.' To think about the question, you need people who can be creative and who are curious."
Booz Allen's first step in building data science teams is to ensure a mix of math and statistics people, computer science people and domain experts from within the business. The domain experts are crucial because they ensure that the big data analytics have business value and will help drive decisions.
The one caution with domain experts, though, is to rotate them in and out of the data science team. The idea is to expose the data science team to multiple lines of business while also sending domain experts back to the business, where they'll become evangelists for the use of data-driven methods.
"After six to nine months, you really want a fresh view of the business problems, otherwise things get myopic and you go too narrow and deep in one part of the business," Sullivan said. "Every line of business has its own BI dashboards and analytics and it's kind of like trying to read a map through a tube."
Too many companies are structured with specialized analytics teams for specific departments and lines of business, said Sullivan. Such teams often fail to consider the context of the overall business. The approach also fosters a "data-hugger" mentality whereby departments hoard their data and put up obstacles to wider usage of information.
Another questionable practice is hoarding analytics expertise within an R&D team that has little exposure to the rest of the business.
As I reported last year, Dow Chemical has had huge success by having data experts work side-by-side with subject-matter experts. New domain-specific cost models saved the company billions on freight and raw-materials spending alone, and the data experts have since turned to other areas of the business.
Looking beyond statisticians and computer scientists, Sullivan said Booz Allen has had success bringing physicists and music majors onto data science teams. Both groups tend to bring curiosity and experimentation into play, Sullivan said, with physicists "exuding the scientific method" -- moving from conjecture to hypothesis to testing -- and music majors offering "amazing creativity and quantitative skills."
That music-major advice might sound odd, but creativity comes into play when data science teams consider mashing together data in much the same way that composers might experiment with combinations of instruments. The combination of skills has worked out in several recent projects seeded by Booz Allen personnel, according to Sullivan. For example, the consultancy worked with a pharmaceutical company and mashed up adverse-drug-reaction data, social media data, research notes, lab data and molecular data. It was the first time all these sources were combined, and it helped the company better prioritize expensive drug research, according to Sullivan.
In another recent project, Booz Allen helped an airline prove big data value by taking data on schedules, routes, fares, destinations and historical passenger loads and combining it with sports schedules, convention dates, school seasonality, people movement by age segment, and social media data.
"The airline had lots of BI dashboards and PDF reports about each of these areas separately, but they had never combined all of that information and let machines go to work," Sullivan said. "The results helped the airline make adjustments to flight schedules and fares that have resulted in tens of millions of dollars in additional revenue."
The core analysis took three people on a data science team seven weeks to complete, and it supports Sullivan's point that "letting smart people go off and stitch together lots of information can really pay off."
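To make the idea of "stitching together" sources concrete, here is a minimal sketch in Python of joining flight records with an external events calendar by date and destination. It is an illustration only, not Booz Allen's or the airline's method: the field names, the sample records and the helper functions `join_flights_with_events` and `average_load` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data; none of these records come from the article.
flights = [
    {"date": "2013-02-03", "destination": "MSY", "load_factor": 0.97},
    {"date": "2013-02-10", "destination": "MSY", "load_factor": 0.71},
    {"date": "2013-02-03", "destination": "ORD", "load_factor": 0.68},
]
events = [
    {"date": "2013-02-03", "city": "MSY", "event": "championship game"},
]


def join_flights_with_events(flights, events):
    """Attach to each flight the events at its destination on its date."""
    events_by_key = defaultdict(list)
    for e in events:
        events_by_key[(e["date"], e["city"])].append(e["event"])

    combined = []
    for f in flights:
        key = (f["date"], f["destination"])
        # .get() avoids creating empty entries for flights with no event.
        combined.append({**f, "events": events_by_key.get(key, [])})
    return combined


def average_load(rows, with_event):
    """Mean load factor for flights that do (or do not) coincide with an event."""
    loads = [r["load_factor"] for r in rows if bool(r["events"]) == with_event]
    return sum(loads) / len(loads) if loads else None


combined = join_flights_with_events(flights, events)
print(average_load(combined, with_event=True))
print(average_load(combined, with_event=False))
```

Comparing the two averages is the simplest possible question to ask of the combined data. A real project would add many more sources and let statistical models, not a single average, look for the patterns.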
Big-data crowdsourcing company Kaggle has seen similar benefits from diversity. Its competitors, who include astronomers, hedge fund quants, statisticians, economists and mathematicians, have blown away analytic benchmarks set by less diverse internal teams that are used to doing things the same old way.