Readers had plenty to say when we called for university CIOs to bear down on using and teaching big data. Can you learn from the example of Bill Gates' high school teacher?
My first InformationWeek column on big data incited some hot responses, such as: "Who has the time to waste working with software that is so ill featured and unreliable that it is like something out of the '80's?" and "If certain data is important to an institution, then that data belongs in the data warehouse, not in some newfangled database that doesn't even impose validity constraints." The meanest comment was, "Big data is a fad and you're just a shill for the vendors who created the fad because they had run out of things to sell us."
A few of the CIOs who responded confessed that they did not know much about the software and tools I had cited, and they sure didn't have a clue as to why their faculty wanted these tools. I tried to help by pointing them to some explanatory websites for Hadoop, MapReduce, Cassandra and R.
The responses I found most intriguing came from CIOs who said that they had already taken steps to give their faculty access to big data tools, and who are now itching to expand their own role with these technologies by using big data analytics to improve IT operations.
I asked what was stopping them, and they said: "I have no money" and "My staff is already overwhelmed dealing with clouds and mobile and virtualizing and distance learning and every other game changer that's been happening in higher ed technology. How can I tell them they now have to learn Hadoop?" A fair question, given the sad state of Hadoop's interface -- a question that can't be answered just by me sharing links to websites.
My advice to them was to first identify the pain points in IT operations that happen to be awash in unstructured data. A pain point might be trying to keep the network secure. Or perhaps they are getting too many complaints about network performance or reliability. These pain points do not suffer from a lack of data. The log files produce so much data that staff can barely skim them each day. This massive amount of data creates its own problems, since the attention one can pay to a warning signal is inversely proportional to the number of warning signals being emitted. (I just made that up, but I'm pretty sure it's true.)
If the data for a major pain point is generated, but it is way too much to comprehend, you have a perfect big data problem!
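To make the log-file example concrete, here is a minimal Python sketch of the first triage step: collapsing a flood of warnings into a short ranked list. The log format, the sample lines, and the function name `summarize_warnings` are all assumptions made up for illustration; real logs will need their own parsing, and at real volumes this counting would run on something like Hadoop rather than on one machine.

```python
import re
from collections import Counter

# Assumed (hypothetical) log format: "<date> <time> <LEVEL> <source>: <message>"
LINE_PATTERN = re.compile(
    r"^\S+ \S+ (?P<level>[A-Z]+) (?P<source>[\w.-]+): (?P<message>.*)$"
)


def summarize_warnings(lines, top_n=3):
    """Count WARN/ERROR lines per (level, source); return the most frequent."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if match["level"] in {"WARN", "ERROR"}:
            counts[(match["level"], match["source"])] += 1
    return counts.most_common(top_n)


# Made-up sample data, not taken from any real system.
SAMPLE_LOG = [
    "2013-05-01 08:00:01 INFO dhcp: lease renewed",
    "2013-05-01 08:00:02 WARN vpn-gw: login failed for user a",
    "2013-05-01 08:00:03 WARN vpn-gw: login failed for user b",
    "2013-05-01 08:00:04 ERROR core-sw1: link down on port 12",
    "2013-05-01 08:00:05 WARN vpn-gw: login failed for user c",
    "2013-05-01 08:00:06 ERROR core-sw1: link down on port 12",
    "2013-05-01 08:00:07 WARN wifi-ctl: high retry rate",
    "garbage line",
]

if __name__ == "__main__":
    for (level, source), count in summarize_warnings(SAMPLE_LOG):
        print(f"{count:>3}  {level:<5}  {source}")
```

The point of the design is the one made above: staff cannot attend to every warning, so the sketch trades thousands of individual lines for a handful of counts they can actually read.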
Those CIOs who have been seeing their budgets increase each year can just hire a consulting team to come in and set up systems to tackle the pain points. Unfortunately, none of my respondents seems to be in that situation. And they are not likely to have any data scientists on staff.
So where can higher ed IT leaders get the people to create the research design and the predictive algorithms, deploy the software and tools, mine the data, and then reach that Eureka moment, known as the "big insight"?
The experts tell us that anyone starting a big data analytics project will likely need a cross-functional team with at least one person who has the knowledge and skills to use these new tools, one who has predictive analytics skills, and one who is great at visualization. They point out that the team also needs a project manager who has superior business domain knowledge.
My respondents admitted that they do have folks with business domain knowledge, but the other skills ... no way! Yet these are cutting-edge leaders who have deployed big data tools for faculty and students, something that few higher ed CIOs have done. This means that every day in their institutions students are gaining experience in using R, Hadoop, MapReduce and a variety of visualization and predictive analytics tools.
To these CIOs I'd like to ask: Could some of these students be recruited by your IT organization to be part of a junior data sciences team? If assembling a student team seems too much of a challenge, then how about tapping into existing student teams in your school's big data programs, such as this one at George Mason, which has teams of students from many disciplines engaging in a final project involving a big data set -- which could be your set of log data!
Bill Gates was a high school student when mainframe computers first came to Seattle. His teacher set up a link with a local corporation and then organized the students into teams to learn computer programming so they could help the high school and the corporation take advantage of this new technology.
Big data projects could provide the higher ed IT leader with the opportunity to become this kind of world-changing teacher.
I look forward to your comments. If you are using big data analytic tools to address pain points in your IT department, please share! Your colleagues are especially interested in knowing how you managed to build your data sciences team, so share that, too.