Big Data. Big Decisions
InformationWeek
Special Coverage Series

Commentary

Jennifer Priestley




Big Data Education: 3 Steps Universities Must Take

How can universities help meet the growing demand for data scientists? Consider this advice from a professor working in the trenches with tomorrow's analytics pros.


By now, we all know that the "sexiest job of the 21st century" is the data scientist. A scan of articles and blogs describing data scientists and their raw material -- big data -- reveals several "sexy" themes. First, data is ubiquitous, big and coming at us with increasing velocity. Second, traditional tools that have been used to extract and analyze 20th century data don't work with big data. Third, incredibly few people have the skills necessary to translate this tsunami of data into meaningful information -- making them the hotshots in the job market.


McKinsey estimates that by 2018 there will be a shortage of almost 200,000 people with deep analytical talent. No doubt the data scientists' dance cards will be full.

So, with all of this demand, combined with a high national unemployment rate, university students are beating down the doors for acceptance into the data science programs on campus -- right? Sadly, the answer is "no" -- but not because students lack interest in courses aligned with data science. They want to be job hotshots. The issue is that no university in the country has a program in data science.


We understand the general reasons why universities have not pivoted to better meet the demands of the market in this space: ivory tower mentalities, a shortage of academics with the experience or the skill set to teach big data analytics, and a lack of actual big data for the classroom.

As an academic, and a former practicing statistician/consultant, I believe universities have to address the challenge and partner with the private and public sectors to close this talent gap. Specifically, I recommend three considerations for universities in the area of data science:

1. Data science should not be an undergraduate degree. It's too broad, too nuanced and too demanding for an 18-year-old student to understand. Undergraduate students who are interested in eventually pursuing data science should study mathematics or computer science and take elective courses in some content area like finance, biology or sociology. During their undergraduate studies, students should be building the absorptive capacity they will need to develop the deep and wide skills required to be competitive in this space.

2. Any graduate degree in data science must integrate the disciplines of mathematics, statistics and computer science. This is a particularly daunting challenge for many universities, as these disciplines typically are housed in different departments or even different colleges. Data science is inherently interdisciplinary. Any Master's or doctoral degree would necessarily include:

a. A foundation in computational mathematics, such as matrix algebra, combinatorics and graph theory. This is critical because the other skills cannot be developed without some numerate orientation.

b. Programming, namely strong analytically oriented programming such as SAS and R, as well as strong general-purpose programming in languages such as C++, Java or Python and in frameworks such as Hadoop. Some coursework in high performance analytics is particularly valuable.

c. Statistical analysis, model development and data visualizations. These skills are not going away; they are evolving.

d. A working knowledge of a content area. After all, data science has power in application, not in theory.

e. A practicum/work experience component. This cannot be overemphasized. If you try to teach someone to swim through lessons from a textbook, they will drown when thrown into a pool. Graduate students studying data science need practical experience working with complex, unstructured data. While we try to create realistic experiences in the classroom, ultimately, they are not real.

3. Research. This is a nascent but expanding field of study. New problems emerge every day. Data science, much like medicine, lends itself to applied (versus traditionally theoretical) research. Conferences provide great opportunities for graduate students to present white papers on new code, develop creative solutions to new problems and even give name and structure to emerging issues. These are all part of the fertile field of data science research and scholarship.

Some companies, such as EMC/Greenplum and IBM, are bypassing universities altogether and developing data scientists in-house. This is a reasonable short-term response, given the absence of programs of study. However, if the talent gap is to be closed over the long term, universities are going to have to rethink how they approach the science of analytics.


Dr. Jennifer Priestley is an Associate Professor of Statistics at Kennesaw State University, where she is the Director of the Center for Statistics and Analytical Services. She also oversees the undergraduate curriculum in Statistics, and was recognized by the SAS Institute as the 2012 Distinguished Statistics Professor of the Year. She served as the Co-Chair of the 2012 National Analytics Conference in Las Vegas, NV.






