University Data Sharing Project Takes Big Step Forward
Predictive Analytics Reporting (PAR) Framework initiative publishes its data definitions, clearing way for researchers to glean information from anonymized student data that might result in ways to improve graduation rates.
The Predictive Analytics Reporting (PAR) Framework, a nearly two-year-old project that has been aggregating student data from two-year and four-year institutions, passed an important milestone this week, releasing the common data definitions for all the variables in its database.
That database, compiled from PAR's member institutions, now includes more than 1.7 million anonymized and institutionally de-identified student records and 8.1 million course-level records.
Launched in May 2011 by the WICHE Cooperative for Educational Technologies (WCET), the PAR Framework started with six institutions sharing data. Its goal was to identify variables that influence student retention and progression, and guide decision making that improves postsecondary student completion in the U.S. The project has since grown to 16 participating institutions, and has received $3.56 million from The Bill & Melinda Gates Foundation to date.
"Interest in analytics across the board, and learning analytics in particular, have taken higher education by storm," WCET Executive Director Ellen Wagner told InformationWeek.
Attention to educational outcomes in postsecondary education has increased as graduation rates in the U.S. have declined. According to the Department of Education, of those who enroll in a higher education program, only about 55% graduate within six years.
Although the PAR Framework isn't the only effort investigating educational outcomes, it might be the most granular. That's because its data set, with some 60 variables, records not only whether a particular student passed a course, achieved a major or dropped out, but also down-to-the-course-level details from student records.
Capturing this structured data hasn't been easy. Even the seemingly routine "grade point average" is handled differently at different institutions, Wagner said.
In the future, the database might expand to include semantic data.
The data definitions were released under a Creative Commons license "to encourage distribution of the definitions into the higher education research community," WCET said in a statement. Until now, PAR's data fields and definitions have been available only to the organization's 16 institutional partners.
The PAR Framework published the data definitions using the Data Cookbook, a collaborative data dictionary and data management tool built for higher education by IData.
Wagner hopes the release of the data definitions will spur interest and participation in the data-sharing project. "I would like to double our number [of participant institutions] this year," she said. Another goal for 2013, Wagner said, would be the release of an "intervention taxonomy," which will help benchmark the efficacy of different types of educational interventions for at-risk students.
The PAR Framework has already published some research based on its preliminary data set from the proof of concept. Last summer, it published research in the Journal of Asynchronous Learning showing that the higher the number of consecutive credit hours a college student takes, the greater the risk of dropping out.
Institutions interested in learning more about becoming members of the PAR Framework are invited to review the requirements for PAR participation and submit an interest form.