For the last few years, major universities around the country have debated the issue of "data science" as a unique discipline. Specifically, the debate has centered around two positions: Either 1) universities just continue to teach the foundational topics of mathematics, statistics, and computer programming as they always have, and then leave the graduates to learn the rest on the job; or 2) data science is emerging as a unique discipline that deserves its own unique curriculum, and students should be allowed to obtain degrees in data science (and DS-esque areas like predictive analytics).
A survey of major universities around the country suggests that option 2 is winning. And that is a good thing.
After all, universities are really the farm system for the major leagues of data science. There is a talent gap, and it's good to see that we (the academic community) have realized that we cannot contribute to this talent pool in any meaningful way by continuing to teach foundational subjects like mathematics and statistics the way they have always been taught. Universities appear to be pivoting to meet the demands of the market.
But, in many instances, the data science discipline is evolving within the university business school. And this is troubling.
Data science is, pedagogically, much more like English than marketing. Why? Learning how to write well and use proper grammar is a core set of foundational skills that all college graduates must develop. We don't teach English in the college of engineering and in the college of business and in the college of science. We send everyone in the university to the English department. Then, students take these nascent writing skills and apply them in their engineering, business, or science courses. Good engineers and scientists must write well -- whether they want to or not.
I would argue that today, the ability to work with data is becoming another core foundational skill that graduates of four-year accredited universities must have -- whether they want to or not. The rationale is straightforward: Data is ubiquitous across all sectors of the economy. Whether you are engaged in engineering, marketing, chemistry, finance, psychology, or political science, you will have to understand the basics of translating raw data into information to support the decision-making process -- yours or someone else's.
This is not to say that all students want to become computer programmers or statisticians. They don't. But they need to know some of the basics -- just as not all students want to become professional writers, but they have to know how to write.
I like Vincent Granville's recent post regarding "vertical" data scientists, which he refers to as "fake," versus "horizontal" data scientists. According to Granville, "Vertical data scientists are the by-product of our rigid university system which trains people to become either a computer scientist, a statistician, an operations research or an MBA guy." He describes "horizontal" data scientists as such: "They combine vision with technical knowledge." I would add to this that they have some area of subject matter expertise -- they understand how to apply basic ETL (extract, transform, load) functions, programming, visualization, and modeling skills to some "content domain." Again, this could be engineering, marketing, chemistry, finance, psychology, or political science.
And therein lies the formula for universities.
The discipline of data science should sit outside of the traditional siloed towers of universities. Embedding data science in the business school or in any content area is problematic for two primary reasons:
First, it is resource inefficient, like teaching English or math within each school on campus. Data science is a combination of mathematics, statistics, computer programming, and some area of application. Let the business school become an area of application of analytics -- because that is what it is.
Second, the interdisciplinary classroom allows for exchanges of ideas and solutions not seen in a narrowly siloed classroom. When students studying economics, chemistry, physics, political science, sociology, and finance sit side-by-side in an analytics course, the students will offer alternative perspectives to problem solving and issue resolution, which are, in many instances, far more instructive and valuable for the other students than anything the professor could have planned. Students then take their learnings from their fellow students in the other disciplines back to their home departments. In our own university, where we have these interdisciplinary classrooms in applied analytics, we have seen, for example, finance majors using risk modeling approaches to solve nursing problems related to likelihood of patient re-admittance, and actuarial students using survival analysis concepts from epidemiology to solve problems related to insurance underwriting.
Michael Rappa, director of the Institute for Advanced Analytics at NC State, has figured this out. The institute is a template for interdisciplinary success, where subject matter experts from departments across campus (and from outside of academia) come to the institute to teach for a specified number of weeks in their areas of expertise, bringing the front line of data science into the classroom. Says Rappa, "Isolating analytics and data science in the business school is the surest way to kill it."
There is no question that data science is emerging as a unique discipline within universities around the country. Where data science ultimately resides within the university will influence its graduates' ability to compete in the major leagues of data science.