Data Scientist: Human Today, Software Tomorrow - InformationWeek


09:21 AM
Jeff Bertolucci

Data Scientist: Human Today, Software Tomorrow

Automation will lessen the need for the elusive, talented, and expensive human data scientist -- and that's a good thing, says Narrative Science's cofounder.

Data scientists, you may look sexy today, but automation will win our hearts in the end.


So says Narrative Science cofounder and chief scientist Kris Hammond, who predicts that 2015 will bring less investment in "human-powered data science" and more in automated software tools that mine big data to unlock valuable insights.

In an interview with InformationWeek, Hammond, who doubles as a computer science professor at Northwestern University, expounded on a variety of data-focused topics, including what he sees as the end of the "data-hoarding era" in the enterprise and the emergence of artificial intelligence (sans the killer robots) in mainstream life.

Narrative Science provides one form of this automation. The company makes natural-language-generation software, most notably Quill, which examines big data feeds, extracts information relevant to the user, and generates -- or writes -- reports in human language, typically English. Its competitors include Automated Insights, whose Wordsmith platform also generates written reports -- sometimes thousands per second -- including online sports and finance stories you may have read.


Hammond has nothing against data scientists -- he's one himself. But he believes the explosive growth of big data technologies will require a more automated approach to information analysis. Besides, data scientists are expensive to employ, and there simply aren't enough of them to go around, he claims.

"They're never going to be able to scale into the kind of reporting that is absolutely essential for organizations now," said Hammond.

(Source: Geralt/Pixabay)

Data scientists are often called upon to do relatively mundane tasks that don't put their data analysis skills to good use. One example is "being asked to spend my day looking at sales figures for 10,000 stores and write reports based upon those sales figures," said Hammond.

He added: "If I were asked to do that, I could do it. It actually requires some of my skills, but it would kill me. It would drive me mad because, in fact, that's not me using my skills at the high end of my skill set."

This dichotomy between the mundane and magnificent is commonplace in today's data science teams. And as we pull in increasing volumes of data, such as the data streaming in from billions of Internet-connected devices, the need for automation becomes more apparent.

"Data scientists, because they're so few of them and they're so expensive, and [because] they want to work on hard and interesting problems, are not going to help us get to the nuts and bolts of understanding data, or even communicate the basics of what's going on in the world," Hammond said.

A greater reliance on automated analysis might help enterprises extract value from their swelling big data stockpiles, which are increasingly measured in petabytes. "We don't all have to become data scientists in order to work with the machine," Hammond said. "The machine needs to become more human and work with us."


Jeff Bertolucci is a technology journalist in Los Angeles who writes mostly for Kiplinger's Personal Finance, The Saturday Evening Post, and InformationWeek.
D. Henschen
User Rank: Author
1/22/2015 | 8:55:09 AM
Data analysts using the least of their skills
I've heard the complaint about data analysts wasting time on mundane reporting tasks for years. As I report in this slide show on CIO priorities for 2015, GameStop had the same problem. "We were getting more questions than we had time to respond to, so our analytics team was turning into a reporting team," said Jason Kappel, GameStop's director, CRM, at the recent NRF (National Retail Federation) Big Show in New York.

Using the combination of Tableau Software for dashboard-based reporting and Alteryx for deeper analytics, GameStop was able to set up more of a self-service environment for business users. "We took Monday-morning reporting routines that were taking half a day and brought them down to about 5 minutes," said Kappel.

This common problem has been the impetus for the trend toward self-service reporting, which has led to fast growth for business-user-friendly BI tools such as Tableau, Qlik, and many imitators that have since added data-exploration and data-visualization products.
User Rank: Ninja
1/21/2015 | 10:07:20 AM
While automation will undoubtedly help organizations with some of the tedious tasks currently performed by data scientists, the evolution will only afford more time for specialists to utilize their talents to continuously come up with creative new ways to leverage data. We need to keep in perspective that the big data analysis space is still maturing. Most organizations still have quite a way to go before they are truly effective at utilizing data. It is something that comes with time, experience, and, honestly, learning from failures and mishaps.

Peter Fretty, IDG blogger working on behalf of SAS
