"We knew traditional data warehouse methods where you set up cubes and try to anticipate what kind of queries fans would run wasn't going to work, and that's where Hana came in," NBA senior VP and CIO Michael Gliedman told InformationWeek in a telephone interview.
Only basic stats were previously available on the NBA site. The deepest historical data had been managed on conventional databases that saw limited internal use. Over the last year, stat-enthusiast NBA executives -- namely commissioner David Stern and deputy commissioner Adam Silver -- called for the resource to be opened up to the fans, Gliedman said. Late last summer the NBA began the work of transforming and consolidating legacy data and building the new stats engine.
The NBA selected SAP Hana mainly because it wanted to support fast, flexible querying. With the entire stat dataset held in memory on Hana, fans will be able to split, filter and query data as they see fit, Gliedman said. It's not a huge trove of data, at less than a terabyte, but conventional OLAP cubes would have confined analysis to a limited set of predefined queries.
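As a rough illustration of why ad hoc querying matters here, the sketch below computes a late-game shooting percentage with the time cutoff passed in at query time rather than fixed in advance, which is the kind of question a predefined cube would have to anticipate. It is a minimal Python sketch, not the NBA's or SAP's implementation: the `Shot` record, the `late_game_pct` function and the sample data are all hypothetical, and overtime periods are ignored for simplicity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    """Hypothetical shot record; field names are assumptions for illustration."""
    player: str
    period: int        # 1-4 for regulation play
    seconds_left: int  # seconds remaining in that period
    made: bool


def late_game_pct(shots, window_seconds, final_period=4):
    """Return each player's shooting percentage on shots taken in the last
    `window_seconds` of the final period. The caller picks the window."""
    made = defaultdict(int)
    attempts = defaultdict(int)
    for shot in shots:
        if shot.period == final_period and shot.seconds_left <= window_seconds:
            attempts[shot.player] += 1
            made[shot.player] += int(shot.made)
    return {player: made[player] / attempts[player] for player in attempts}


# Made-up sample data, not real NBA statistics.
SHOTS = [
    Shot("Player A", 4, 90, True),
    Shot("Player A", 4, 250, False),
    Shot("Player A", 2, 30, True),
    Shot("Player B", 4, 110, False),
    Shot("Player B", 4, 200, True),
    Shot("Player B", 4, 280, True),
]

two_minute = late_game_pct(SHOTS, window_seconds=120)
five_minute = late_game_pct(SHOTS, window_seconds=300)
print(two_minute)
print(five_minute)
```

Changing `window_seconds` from 120 to 300 changes which shots qualify, and therefore the ranking, without any change to how the underlying records are stored.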
"You can select any date range, any point within a game, and you can do things like come up with your own definition of 'clutch shooters,'" he said. "Is that two minutes before the end of the game or five minutes? You decide."
Another challenge in bringing a query tool to a high-traffic website like NBA.com is concurrency, meaning potentially high numbers of simultaneous users. NBA.com averages about six million unique visitors per month, according to Compete.com's online site analytics engine. Far fewer fans are likely to be on the stats page firing off queries at any one moment, but the NBA expects as many as 20,000 concurrent users, and it has load tested the Hana-based stats engine accordingly, Gliedman said.
As for the latency challenge -- delays in gaining access to the very latest data -- NBA.com/stats is not quite "real-time," as touted in a press release, but Gliedman said stats will be available within 15 minutes of the end of each game. The delay has more to do with officials finalizing the stats than with any technical lag. Even so, 15-minute latency is much faster than most businesses experience with the overnight batch-ETL (extract, transform, load) processes that are typical in data warehousing. The NBA is using SAP Landscape Transformation software for continuous, rather than batch, data integration.
NBA.com/stats is launching with a modest collection of data-visualization capabilities, including shot charts and trend graphs, but Gliedman said the NBA plans to add drag-and-drop data-visualization options using SAP BusinessObjects Explorer and Visual Intelligence software. By next season the NBA also hopes to integrate video capabilities that would tie statistics to related clips.
"If a query shows that a player had three steals in a game, you'll have the option to launch the videos of those three plays," Gliedman explained.
The NBA and SAP are both pushing for wider use of analytics in sports. NBA deputy commissioner Silver was a keynoter at last year's MIT Sloan Sports Analytics Conference, where he talked about the role of predictive analyses in the 2011 player lockout and league-player contract negotiations. The salary caps and revenue-sharing arrangements now common across most pro sports are based on predictive revenue projections, he said.
SAP has become much more visible in sports in recent years, and it now supplies its business intelligence and analytics technology to Major League Baseball, the NFL.com Fantasy Football site, the San Francisco 49ers and other leagues and teams.