12/30/2013
09:06 PM

How To Build A Successful Data Science Team

Don't try to find one superhuman who does it all. You need three experts: business analyst, machine learning expert, and data engineer, says Lithium Technologies chief scientist.

Is there really a data scientist shortage, or are organizations simply trying too hard to recruit a unicorn, a jack-of-all-trades who possesses both advanced technical and business acumen? 

If the unicorn hypothesis is true, it would explain why the scarcity of data scientists is expected to worsen in the coming years.

The solution isn't difficult, some industry insiders believe, but it might prove unpopular with cost-conscious organizations that are unable or unwilling to hire a data science team rather than a single data scientist.

Dr. Michael Wu is chief scientist of Lithium Technologies, a San Francisco-based company that sells social customer experience management software to businesses. Not surprisingly, Lithium captures a lot of data on consumer behavior, and part of Wu's job is to analyze that information and predict customer actions on an aggregate level.


Wu believes the term "data scientist" is tossed around loosely these days, so much so that it's creating a bit of confusion in the tech industry.

"What the industry calls a 'data scientist' now is really several different roles," said Wu in a phone interview with InformationWeek. "When people say there's a shortage of data scientists, (they mean) there is a shortage of people with all of these different skills."

Wu subdivides the data scientist role into three distinct jobs, each requiring a different skill set: business analyst, machine learning expert, and data engineer.

"You need these three groups of people to work together in order to inform the business decision-makers," said Wu.  

The role of business analyst existed long before the terms "big data" or "data scientist" were in vogue. This person works with front-end tools, meaning those closest to the organization's core business or function, such as Microsoft Excel, Tableau Software's visualization tools, or QlikTech's QlikView BI apps. A business analyst might also have sufficient programming skills to code up dashboards, and have some familiarity with SQL and NoSQL.

"They analyze business-level data and try to produce actionable insights," said Wu. "A lot of companies have (these) people."

The recent hype surrounding big data, however, has led many business analysts to rebrand themselves as data scientists even though they are not, according to Wu's definition.

"It automatically gives them a little boost in their salary," Wu said, chuckling.

The second data science role is that of machine-learning expert, a statistics-minded person who builds data models and makes sure the information they provide is accurate, easy to understand, and unbiased.

"These are the people who develop algorithms and crunch numbers," said Wu. "They are interested in building models that predict something."

A machine-learning expert, for instance, might develop algorithms that predict consumer sentiment or estimate a person's influence in a particular industry.

"There are even machine-learning algorithms that look at images and tag them automatically, or look at videos and try to understand what the video is about," said Wu. 

Like the business analyst, the machine-learning expert isn't a new role, but rather one that's existed "in the last 30 years or so," Wu estimated.

The third key job, data engineer, is "the bottom layer, the foundation," said Wu. "They are the ones who play with Hadoop, MapReduce, HBase, Cassandra. These are people interested in capturing, storing, and processing this data… so that the algorithm people can build models and derive insights from it."

However, it's nearly impossible to find one person -- that data scientist unicorn -- who excels in each of these three areas, Wu said. And that's why organizations must focus instead on building a data science team.

Jeff Bertolucci is a technology journalist in Los Angeles who writes mostly for Kiplinger's Personal Finance, The Saturday Evening Post, and InformationWeek.


Comments
Whoopty,
User Rank: Ninja
12/31/2013 | 6:33:40 AM
Good luck
Good luck to those that find themselves needing to convince higher ups that they need to take on three people to tackle one particular function!
DanielN381,
User Rank: Apprentice
12/31/2013 | 9:04:45 AM
So much to learn
Interesting post, but in this day one skill is not enough for you to be a scientist. You must know how to collect data, process data, build a model, and present it in a meaningful way. 
WKash,
User Rank: Author
12/31/2013 | 11:15:42 AM
Super Team
One would not expect to have a single person trying to master all the dimensions of security (software, infrastructure, risk management, etc.). It takes a team. I suspect the roles and dimensions of big data are still so new, it's easy to see how organizations are latching onto the notion of a data scientist. It's a convenient way to embody the disciplines. There may be a few supermen/superwomen out there. But most likely, the organizations that will profit from data sciences will be those with a super team.

 
RobPreston,
User Rank: Author
12/31/2013 | 11:32:29 AM
Re: So much to learn
We as a society tend to throw around the words "scientist" and "science" too liberally. Colleges award bachelor of science degrees in such non-scientific fields as management and philosophy. Urban planners fancy themselves as social scientists. And don't get me started on political science--double-speak is far more of an art.
shamika,
User Rank: Apprentice
12/31/2013 | 11:50:07 AM
Re: So much to learn
This is an interesting article and one of my favorite subjects. This gives the ability to understand the data sets and to perform detailed analysis on the same.

 
shamika,
User Rank: Apprentice
12/31/2013 | 11:55:44 AM
Re: So much to learn
@DanielN381 yes, you are correct, and having all of these skills will help you master data.
shamika,
User Rank: Apprentice
12/31/2013 | 12:09:58 PM
Re: So much to learn
Developing algorithms and crunching numbers is also an important aspect when it comes to data scientists.
I give,
User Rank: Apprentice
12/31/2013 | 7:43:48 PM
Re: So much to learn
Thanks, Rob. I could go on and on about this. A "Data" Scientist? What would we expect an "Air" Scientist to know and do? How about a "Dirt" Scientist, "Atom" Scientist, "Wood" Scientist, "Word" Scientist, "Fur" Scientist (not to be confused with a "Hair" Scientist)? Which came first, Data Science or the first Data Scientist? Heck, I doubt that there is an agreed-upon definition of Data, not among Data Scientists anyway.

IMHO, the very idea that a number cruncher is expected to develop consumer insights is so naive that it does little more than show the tendency for everyone in every discipline whatsoever to assume that someone else does the work, and the "user" need know nothing, or do nothing other than be a "manager" who hires a bunch of other managers all the way down to the single person who does everything, which is then fed up the food chain to The Manager.

So the best way to succeed is by knowing nothing, but by getting to manage the most managers you are capable of.  Just be sure that the lowest level manager has a Data Scientist working for them.  That low level manager can always replace the Data Scientist if the individual is not up to carrying the Company.
cbabcock,
User Rank: Strategist
1/2/2014 | 4:21:56 PM
Specialties matter but world needs generalists
The three roles captured in the term "data scientist" perhaps should be handled by three people of different skill sets. But in an ideal world, these three people would rotate jobs within the trio every three months until each could take a stab at performing the role of the others. The person who consistently performs best in all three roles should be named the team leader. Sounds crazy, but the world will always need generalists on top of the specialists.
pcalento011,
User Rank: Apprentice
1/4/2014 | 11:48:17 PM
Skills important, but context even more so.
One of the items to address re: data science isn't so much the skills or team, but the understanding of what the data all means. This is not merely an academic exercise. This is business. Patterns need to be mapped and measured into a relevance scale. Without this added step, Big Data will be just that ... more and larger data. Context. Context. Context.