Big data pros need a variety of skills, but these three attributes matter most, says FICO's chief analytics officer.
It takes more than a Ph.D. -- or two or three -- to be a first-rate data scientist. An assortment of technical, business and people skills is required, a unique combination that partly explains the predicted shortage of data science professionals within a few years.
FICO, known for its analytics and decision-making products and of course its eponymous credit-scoring service, has a new infographic that summarizes the eight characteristics of a top-notch data scientist. You'll find it here.
According to Dr. Andrew Jennings, chief analytics officer at FICO and head of FICO Labs, three of these characteristics are most important, and every organization in the market for a data scientist should know what they are. In a phone interview with InformationWeek, Jennings revealed this holy trio.
1. Problem-Solving Skills
This may seem obvious, of course, because data science is all about solving problems. But a good data scientist must take the time to learn what problem needs to be solved, how the solution will deliver value, and how it'll be used and by whom.
Getting this right can be critical to success or failure, according to Jennings. "Here's a good example of that: In the fraud world, it's generally a needle-in-the-haystack problem," he said. "Think about credit card transactions or medical claims: the vast majority of transactions and claims are not fraudulent." As a result, the data scientist's fraud solution must work within a very small operational range. The ability to drill down and focus specifically on the problem at hand is a very important skill, Jennings added.
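To make the needle-in-the-haystack point concrete, here is a minimal sketch of the base-rate arithmetic behind it. The fraud rate, detection rate and false-positive rates below are hypothetical figures chosen for illustration, not numbers from FICO or Jennings, and `alert_precision` is a made-up helper name.

```python
def alert_precision(fraud_rate, true_positive_rate, false_positive_rate):
    """Share of flagged transactions that are actually fraudulent (Bayes' rule)."""
    flagged_fraud = fraud_rate * true_positive_rate
    flagged_legit = (1.0 - fraud_rate) * false_positive_rate
    return flagged_fraud / (flagged_fraud + flagged_legit)


# Hypothetical scenario: 1 fraudulent transaction per 1,000,
# and a model that catches 90% of the fraud.
FRAUD_RATE = 0.001
TRUE_POSITIVE_RATE = 0.90

# Sweep the false-positive rate to see how quickly alert quality changes.
for fpr in (0.05, 0.01, 0.001):
    precision = alert_precision(FRAUD_RATE, TRUE_POSITIVE_RATE, fpr)
    print(f"false-positive rate {fpr:.3f} -> {precision:.1%} of alerts are real fraud")
```

Under these assumed numbers, only about 2 percent of alerts are real fraud at a 5 percent false-positive rate, and the share climbs to roughly 47 percent only when the false-positive rate drops to 0.1 percent. That is one way to read the "very small operational range" Jennings describes: because legitimate transactions vastly outnumber fraudulent ones, the model is useful only at very low false-positive rates.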
2. Communications Skills
"You've got to be able to talk to people who don't have Ph.D.'s in the thing that you have a Ph.D. in," said Jennings. "If all you can talk about is posterior probability distributions, you have a problem."
In other words, you may be brilliant in your rarefied field, but you're not going to be a really good data scientist if you can't communicate with the common folk. However, if you're part of a data science team and have so-so people skills, someone in your group could play the role of great communicator. "You're going to need somebody on your team to communicate to business people what it is you have, and ask the questions of what's important," Jennings said. "You must convince them that what you've done is viable and will deliver business value."
3. Open-Mindedness
"Too often we have our favorites in life -- our favorite anything," said Jennings. "You can have your favorite technique, your favorite approach. People [tend to] shoehorn any problem that comes along into whatever approach they are most comfortable with."
That's why being an open-minded problem solver is the third characteristic that defines a good data scientist. This skill is particularly significant because a data scientist often applies his or her knowledge to multiple industries, such as banking, health care and retail. "One of the great things that a data scientist does is to understand what they don't understand," Jennings explained. "It's the knowing-the-unknown kind of thing. In any problem, the domain expertise is very important, which is why communication is important."
Many big data companies today are developing software tools that automate tasks performed by data scientists. But despite the ongoing democratization of data science, Jennings says we'll always have a need for highly educated data gurus.
One very significant reason: An organization still needs an expert to determine whether a particular solution or model will be reliable in an operational environment. "Do the types of things that I'm seeing in the model actually make sense? How confident am I of these data sources being stable? You can go on and on," said Jennings. "You need to be able to look at what was created, understand it and know whether it's going to work."