Israeli company puts analytics in the cache memory of a central processing unit, reducing need for new hardware purchases.
Big data tends to mean one thing for hardware: Buy more of it.
Companies are seeing their data volumes as much as double every year. They need to store that data, process it, and be able to search it -- none of which is free, even in today's world of dirt-cheap commodity hardware.
Along comes SiSense, a company that started with the aim of doing analytics in the cache memory of a central processing unit. That's right -- analytics in a chip. That was the challenge CTO Eldad Farkash set for himself with SiSense. Farkash's other challenge was to boost performance.
Farkash's ultimate answer was to take huge chunks of data from relational databases, vectorize them (that is, slice them up), and then run analytics on the slices. It's analogous to the difference between sending data over the circuit-switched phone network and the packet-switching approach adopted for the Internet, which splits voice and data into packets and then reassembles them.
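To make the slicing idea concrete, here is a minimal Python sketch. It is not SiSense's code: the 3 MB cache budget, the `chunked_sum` helper, and the sample rows are all assumptions made for illustration. It pulls one column out of row-oriented records into a contiguous vector, then aggregates that vector one cache-sized slice at a time.

```python
from array import array

CACHE_BYTES = 3 * 1024 * 1024  # assumed cache budget (hypothetical figure)
VALUE_BYTES = 8                # one double-precision value

def chunked_sum(column, cache_bytes=CACHE_BYTES):
    """Sum a column by working through it one cache-sized slice at a time."""
    chunk_len = max(1, cache_bytes // VALUE_BYTES)
    total = 0.0
    for start in range(0, len(column), chunk_len):
        chunk = column[start:start + chunk_len]  # one slice of the vector
        total += sum(chunk)                      # analytics on that slice
    return total

# Hypothetical rows as they might come out of a relational table: (id, amount).
rows = [(i, float(i % 100)) for i in range(200_000)]

# "Vectorize": keep only the column being analyzed, stored contiguously.
amounts = array("d", (amount for _, amount in rows))

# A small budget forces several slices so the loop is actually exercised.
result = chunked_sum(amounts, cache_bytes=64 * 1024)
```

The point of the sketch is the shape of the work: each slice is small enough to be processed on its own, and the partial results are combined at the end.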
Another way to explain what SiSense does is that it takes analytics software and makes it work on multicore processors. The technology works in part because chips are now multicore -- in effect, parallel computer clusters on a chip. Software programs optimized for such chips will run in parallel. That sounds a lot like what SiSense does in slicing datasets.
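The split-and-merge pattern behind that description can be sketched in a few lines of Python. Again, this is an illustration under stated assumptions, not SiSense's implementation: `count_slice`, `parallel_counts`, and the sample data are hypothetical names. Note too that threads in standard CPython share one interpreter lock, so this shows the structure of the approach rather than a real multicore speedup; native code or separate processes would be needed for that.

```python
import os
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def count_slice(values):
    """Partial aggregate for one slice: occurrences of each value."""
    return Counter(values)

def parallel_counts(values, workers=None):
    """Split values into one slice per worker, count each, merge the results."""
    workers = workers or os.cpu_count() or 1
    size = max(1, -(-len(values) // workers))  # ceiling division
    slices = [values[i:i + size] for i in range(0, len(values), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_slice, slices):
            total.update(partial)  # merge the partial aggregates
    return total

# Hypothetical column of region names from a sales table.
regions = ["north", "south", "east", "west"] * 25_000
counts = parallel_counts(regions, workers=4)
```

Because each slice produces its own partial count, the workers never need to coordinate until the final merge, which is what makes this kind of query a natural fit for multiple cores.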
Farkash said that SiSense has run queries on the chip against as much as 20 terabytes of data.
The technology works on Intel and AMD architectures. Farkash says SiSense is the first company to support the Many Integrated Cores mode in Intel's Haswell architecture.
Farkash believes that such analytics will soon be something you can hold in the palm of your hand. In two or three years SiSense should work on iPad and Android tablets, though they'll be different from today's tablets, boasting about a terabyte of storage and double-digit gigabytes of RAM.
What CIOs need to know is that SiSense represents a divergence from the high-performance computing Hadoop Hive Mind of big data. In the Hadoop hive world, you go to IT (or the company's data scientist if you're lucky enough to have one) and beg for an hour of query time, then try to stuff as much data as possible into that time. It costs money to do that. You don't want to ask silly questions.
SiSense aims to take the worry out of asking silly questions. (Source: Flickr user Colin_K)
Farkash's premise is that business users might want to ask silly questions -- or at least questions that could have throwaway answers. "They want to know how many seconds it will take to ask a new question or modified question against a dataset," he said during a Skype interview from his office in Tel Aviv.
A typical Intel Core i3 processor will have 3 or 4 MB of cache. That's not big data. That's what you use to run Excel queries.
The approach seems to have struck a chord, given SiSense's impressive list of customers.
Tools like SiSense may be a way to spread analytics beyond today's Hadoop hives. It might even demystify analytics a bit.