Are you one of the millions of users addicted to King Digital Entertainment's Candy Crush? The game saw huge growth when it hit Facebook in 2012, but usage really exploded as the game took off on mobile devices last year. The challenge for King has been dealing with all that data and making sense of the user experience.
The London company stores its raw data on a Cloudera-based Hadoop cluster, so scalability wasn't a problem. "But we were missing exploratory query capabilities," said Andy Done, data platform lead at King. "Hive is great for crunching through huge volumes of data, but when you want to explore and get a feel for your data, it's not responsive enough."
King realized that much of its data was reasonably well structured. Its first attempt to fill the data-analysis gap was to add the InfiniDB database alongside Hadoop. Moving subsets of data to this database opened up plenty of SQL analysis capabilities, but within a matter of months database performance lagged as workloads scaled toward 100 terabytes, according to Done. By this time, King's Hadoop cluster had surpassed 1 petabyte.
In mid-2013, King went back to the market and came across Exasol, a German database management system vendor. Though little known in the US, Exasol has more than 300 customers in Europe, and its 10-year-old database was among the earliest to embrace technologies such as columnar compression, massively parallel processing, and in-memory analysis capabilities. (Exasol opened an office in San Francisco early this year with plans to build North American sales.)
After a successful proof-of-concept project, King brought Exasol into production about a year ago, and it now moves its hottest, most valuable data into this database, which is scaled to handle nearly 100 terabytes.
The routine applications for Exasol include executive-level reporting and dashboards, but the primary customer-facing goal is analyzing the gaming experience. King studies where players might be breezing through the game and getting bored, and where they're getting stuck and having a frustratingly hard experience that might lead them to give up.
"We try to ensure that there's a balance between being challenging and being fun, but one of the things we found was that level 65 in Candy Crush was notoriously difficult," said Done. "We changed the game accordingly to make it slightly less taxing at that level."
King also uses Exasol to study how play, customers, and the gaming experience differ online, on Facebook, and on mobile devices. That analysis ties back to the success of the business and to customers' habits in playing the game on multiple platforms.
In the year since King brought Exasol into production, a profusion of SQL-on-Hadoop options has emerged. Most of these options would allow King to conduct analysis directly on top of Hadoop, but Done said the company is content to use the parallel connections available between Exasol and Hadoop.
"We've looked at some of the SQL-on-Hadoop offerings, but in our assessment, they're not mature enough to meet our use case as yet," he said. Cloudera Impala was one of the SQL-on-Hadoop candidates that King considered, but it was crossed off the list due to memory and insert-and-merge limitations.
Though in-memory analysis is one of Exasol's attractions, Done said RAM is only 5% of the total storage capacity of King's database deployment. "We get exceptional performance with the hot data that's in RAM, but the rest of the data that's on disk is also readily accessible, and it's not a huge restriction on query speeds."
King is content for now, but as the company has previously discovered, workloads change and grow, so its approach is "to remain as flexible as possible," said Done. "A huge amount of innovation is taking place in this space, and that arms race is hugely beneficial for us, because we'll all benefit from better technology for helping us with our data problems."
Doug Henschen is Executive Editor of InformationWeek, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of ...