March Madness: The Analytics Behind the Dance Card - InformationWeek


3/22/2017
07:00 AM
Jessica Davis

March Madness: The Analytics Behind the Dance Card

It may be impossible to predict the perfect bracket, but these academics correctly predicted every "at-large" bid in this year's March Madness NCAA college basketball tournament, and they have been 96% accurate over the last 6 years.

(Image: Brocreative/Shutterstock)


The dance card choices are made, the brackets are filled out, and we are already several days into the annual March Madness college basketball tournament. He may not have a perfect bracket, but Jay Coleman correctly predicted the "at large" bids for this year's tournament, and has a 96% correct prediction rate over the last 6 years. How did he do it?

Coleman and Mike DuMond have collaborated on predicting the dance card since 2000.

Coleman is a professor of operations management and quantitative methods at the University of North Florida and DuMond is a vice president at Economists Inc. and an adjunct professor at Florida State University. Each year since 2000 they have used analytics (and software from SAS, the sponsor of this site) to create their dance card prediction.

Coleman creates a "dance card" that predicts the 36 college basketball teams that will be invited to "at-large" spots in the tournament. These are teams that didn't earn automatic invitations by winning their conference or conference tournament. Instead, these teams are invited by the 10-member NCAA Tournament Selection Committee, which is composed of university athletic directors (but no analytics experts, Coleman told me). The committee picks what it considers the best at-large teams for those 36 tournament slots.

Coleman, DuMond, and another collaborator correctly predicted 36 of 36 at-large bids this year, and have now correctly predicted 209 of 218 at-large bids over the last 6 years combined (96% correct). Since they began their predictions in 2000, they have attracted quite a bit of attention, doing something like 120 media interviews.

The first year they did it, in 2000, Coleman and DuMond looked at the data they had from 1994 and 1999 and published their results in an academic journal. On a lark, on the eve of the committee's announcement of the at-large bids, Coleman called the local television station to tell them about the prediction.

"We turned out to be the lead story," Coleman said.

The original prediction was based on team performance statistics, but the model has since evolved. Now they rely on the RPI, or ratings percentage index, which Coleman says is the ubiquitous stat for college basketball.

"The NCAA came up with it to help them rank and categorize teams and help advise the selection process, and it is still in use," Coleman said. "There are better and more advanced analytics out there, but by and large, our research shows (the members of the committee) don't use it."

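For readers unfamiliar with the RPI, here is a minimal sketch of its commonly published basic form: 25% a team's own winning percentage, 50% its opponents' winning percentage, and 25% its opponents' opponents' winning percentage. The function name and the two sample teams are invented for illustration, the sketch ignores any home/road adjustments, and it is not Coleman and DuMond's dance card model, which the article does not spell out.

```python
def rpi(wp: float, owp: float, oowp: float) -> float:
    """Basic RPI from three winning percentages, each in [0, 1].

    wp   -- the team's own winning percentage
    owp  -- its opponents' winning percentage
    oowp -- its opponents' opponents' winning percentage
    """
    for name, value in (("wp", wp), ("owp", owp), ("oowp", oowp)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Commonly published weights: 25% / 50% / 25%.
    return 0.25 * wp + 0.50 * owp + 0.25 * oowp


# Hypothetical teams, invented for illustration only.
strong_schedule = rpi(wp=0.70, owp=0.62, oowp=0.55)
weak_schedule = rpi(wp=0.80, owp=0.48, oowp=0.50)
print(f"tougher schedule: {strong_schedule:.4f}, softer schedule: {weak_schedule:.4f}")
```

Because half of the index comes from opponents' records, the hypothetical team with the worse record but the tougher schedule ends up with the higher RPI.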
What other inputs does the committee consider when choosing at-large teams? A subsequent academic paper by Coleman and DuMond found evidence suggesting bias in the selection process. The results suggested that committee members with ties to particular teams were perhaps more likely to include those teams in the tournament. Coleman told me the idea of bias surfaced when the actual at-large selections didn't match what they would have been had the committee followed its usual formula. Since then, Coleman has noticed less bias than before, although "there is still evidence that there may be a lingering bias" in favor of some teams in the Pac-12 conference.

I also asked Coleman whether he fills out a bracket himself, and whom he picked to win the tournament. Coleman told me he no longer fills out a bracket because it's just too frustrating. Even when he applies analytics to his picks, he's only as likely to win as someone choosing by dartboard, jersey color, or fuzziness of mascot.

He has applied his dance card model to the bracket, but the bracket simply defies prediction.

"There's just enough of a random element that you will never have enough data to nail it down completely," he said. "If a predictive model predicts at about 75%, maybe 80%, that is the upper end of your accuracy."

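Coleman's ceiling can be turned into rough arithmetic. The sketch below assumes his 75% to 80% figure is a per-game accuracy, that every pick is independent of the others, and that the bracket has 63 games (a 64-team field, ignoring the play-in games). All three are simplifying assumptions of this example, not claims from Coleman, and the function name is made up for illustration.

```python
GAMES_IN_BRACKET = 63  # 64-team field; play-in games are ignored (assumption)


def perfect_bracket_probability(per_game_accuracy: float,
                                games: int = GAMES_IN_BRACKET) -> float:
    """Chance of getting every game right if each pick is independent."""
    if not 0.0 <= per_game_accuracy <= 1.0:
        raise ValueError("per_game_accuracy must be between 0 and 1")
    # Independent events: multiply the per-game probability once per game.
    return per_game_accuracy ** games


for accuracy in (0.75, 0.80):
    p = perfect_bracket_probability(accuracy)
    print(f"{accuracy:.0%} per game -> about 1 in {1 / p:,.0f}")
```

Under these assumptions, even 80% per-game accuracy leaves the odds of a perfect bracket at roughly one in a million, and 75% pushes them into the tens of millions.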
Following his dance card model for the bracket, Coleman's finalist team, Villanova, has already been eliminated. Following my own personal system that I invented on March 15 to fill out my first-ever bracket (has a good journalism school), my final team, Northwestern, has also been eliminated.

 
