For database vendors and customers alike, success depends on affordability, manageability and speedy analysis.
For practitioners, big-data success could be the key to survival or to opening up vast new markets. Among vendors, the incumbents have the most to lose. But fresh survey data suggests that MySQL may be a spoiler for Oracle.
In case you missed "Fast and Big" in the August 9 issue of InformationWeek, you can download it here and read my trend analysis on in-database processing, in-memory innovations and new(ish) alternatives including MapReduce and Hadoop. Here I share some additional observations along with details from a ParAccel customer interview that wasn't completed in time for the feature.
I wrote a lot about Barnes & Noble in my story, and important context recently emerged when the company announced it's considering putting itself up for sale. The Aster Data deployment I wrote about has everything to do with the company's current digital predicament.
Barnes & Noble's new CEO, William Lynch, ran the BN.com site before taking the company's top post about a year ago. Marc Parrish, the VP of retention and loyalty, made it very clear to me that the company is bent on evolving from a brick-and-mortar retailer into a technology-led firm.
As various news accounts have described, BN faces a tough transition. Even if the company succeeds in joining the e-reading craze with its Nook and smartphone e-readers (Parrish says the company already has 20% of that market), it will still have to figure out how to keep its stores in business with fewer retail book sales.
The company hopes to find a happy mix across an ecosystem that includes stores, in-store cafes, e-book downloads, affinity club membership and the BN.com Web site -- thus the enterprise data warehouse (EDW) deployment that replaced nine separate Oracle warehouses that provided siloed domains of analysis.
BN is cutting back on books in its stores and restocking with toys, games and other items that can't be downloaded through e-readers. So now, e-mail campaigns might include a coupon for something other than a book to lure you to visit a store. Once you're there, BN stores now offer free Wi-Fi to entice you to linger. Maybe you'll buy a cup of coffee, one of those toys or games or, in a fit of nostalgia, a good old-fashioned physical book.
"We know from our analysis that if people come in and buy something at the cafe, their average order value in the store goes up," said Parrish. "And when they start buying books on the Nook, we'll be able to tie that insight into the whole ecosystem. That's the thing we're working hardest on -- getting algorithms that are cross-channel."
My latest big-data customer interview offers an example of a deployment that is uncovering potentially huge business opportunities.
Provisio, a medical-research support firm based in Tennessee, needs to quickly query health claims and medical records on more than 41 million U.S. citizens. (Don't worry, identities are abstracted and the database is HIPAA compliant, though Provisio can contact patients indirectly through their doctors.)
Provisio was struggling with long-running queries on a Microsoft SQL Server cluster deployment. Late last year, it switched to a ParAccel database running on HP servers.
ParAccel provides both a column-store approach and massively parallel processing. The product's compression capabilities have dramatically reduced Provisio's formerly 7-terabyte database and corresponding storage needs, according to Sean Harrison, the company's chief security officer and senior information architect. He says total costs including hardware and database licenses were in the $150,000 range.
Most importantly, a drug trial "site proximity" analysis that used to take a week or more on the old platform now takes 10 minutes, and customers can run it themselves through the company's iTrails Web site.
"We're not only doing what we used to do much faster, we're dreaming up new services," says Harrison. Spotting disease hot spots by zip code is one possibility, he says. Another idea is helping life insurers or health insurers fine-tune rates by zip code based on statistical disease frequencies or other factors, such as occurrences of industrial accidents.