Two Cornerstones of Oracle's Database Hardware Strategy
After several months of careful optimization, Oracle managed to pick the most inconvenient* day possible for me to get an Exadata update from Juan Loaiza. But the call itself was long and fascinating, with the two main takeaways being:
Oracle thinks flash memory is the most important hardware technology of the decade, one that could lead to Oracle being "bumped off" if they don't get it right.
Juan believes the "bulk" of Oracle's business will move over to Exadata-like technology over the next five to ten years. Numbers-wise, this seems to be based more on Exadata being a platform for consolidating an enterprise's many Oracle databases than it is on Exadata running a few Especially Big Honking Database management tasks.
And by the way, Oracle doesn't make its storage-tier software available to run on anything other than Oracle-designed boxes. At the moment, that means Exadata Versions 1 and 2. Since Exadata is by far Oracle's best DBMS offering (at least in theory), that means Oracle's best database offering only runs on specific Oracle-sold hardware platforms.
*E.g., I was sitting upstairs in my parents' apartment in Columbus, OH, having the call while their doctor, who I've never met, was visiting downstairs. He offered to make a special trip back Saturday afternoon because he missed me Wednesday, but he's notorious for not coming when he says he will.
Other high- and lowlights of our conversation included:
Flash is the main new hardware element in Exadata Version 2. Otherwise, Exadata 2 is just an annual refresh of Exadata Version 1 to include updated components (Nehalem chips, bigger disk drives, etc.).
Juan thinks it's suboptimal to use flash memory through the bottleneck of disk controllers, favoring PCIe cards instead. (I emphatically agree.)
Juan resolutely ducked questions about actual Exadata production deployment. Literally the only fact he shared in that regard is that there are at least two Exadata production systems running that each have two or more racks cabled together.
Juan stressed that Exadata runs apps written over Oracle DBMS unchanged.
Juan pointed out that in major OLTP apps such as ERP systems, there often is actually more processing going on in reporting and other batch stuff than there is in true OLTP.
Exadata 2's flash memory is designed as a disk cache, smarter than LRU (Least Recently Used). The two examples Juan gave of "smarter than LRU" are that backups and table scans don't flush the cache.
I forget whether this is new in Exadata 2 (I think it is), but anyhow -- Exadata has a "Storage Index" that's a lot like a Netezza zone map. I.e., for each megabyte or so of data, it stores the min and max value of every column; if a query predicate rules out those ranges, that megabyte is never retrieved.
Oracle has long offered what sounds like flexible workload management capability, and this has now been extended to specifically include I/O resources on the storage tier.
This isn't Exadata-specific, but Oracle has built a file system on top of its DBMS, optimized for speed, which helps with, e.g., ELT (Extract/Load/Transform). Evidently, it's not at all the same thing as Marc Benioff's 1990s Microsoft-annoying IFS (Internet File System) project, which seems to have morphed into a content management SDK.
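To make the Storage Index item above concrete, here is a minimal Python sketch of the zone-map idea as described: keep the min and max of every column per storage region, and skip any region whose range cannot satisfy the predicate. It is an illustration only, not Oracle's implementation; the names `Region`, `build_region`, and `scan_between` are made up for the example, and the tiny four-row regions stand in for the "megabyte or so" mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Row = Dict[str, int]


@dataclass
class Region:
    """One storage region (hypothetical stand-in for ~1 MB of data)."""
    rows: List[Row]
    # column name -> (min, max) over the rows in this region
    ranges: Dict[str, Tuple[int, int]]


def build_region(rows: List[Row]) -> Region:
    """Record the min and max of every column when the region is written."""
    columns = rows[0].keys()
    ranges = {
        c: (min(r[c] for r in rows), max(r[c] for r in rows)) for c in columns
    }
    return Region(rows=rows, ranges=ranges)


def scan_between(regions: List[Region], column: str, low: int, high: int):
    """Evaluate `low <= column <= high`.

    Returns (matching rows, number of regions actually read).
    """
    matches: List[Row] = []
    regions_read = 0
    for region in regions:
        col_min, col_max = region.ranges[column]
        if col_max < low or col_min > high:
            continue  # the predicate rules this region out, so it is never read
        regions_read += 1
        matches.extend(r for r in region.rows if low <= r[column] <= high)
    return matches, regions_read


# Three regions holding order_id 0-3, 4-7, and 8-11.
regions = [
    build_region([{"order_id": i, "amount": i * 10} for i in range(start, start + 4)])
    for start in (0, 4, 8)
]
rows, regions_read = scan_between(regions, "order_id", 5, 6)
```

In this toy run only the middle region is read. The pruning pays off when data is loaded in roughly sorted order (dates, sequence numbers), because then each region covers a narrow range of values.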
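The "smarter than LRU" flash cache item above can be sketched the same way. Juan gave only the two examples (backups and table scans don't flush the cache), so the following is one simple way to get that behavior, not a description of what Exadata actually does: an ordinary LRU cache in which reads flagged as bulk bypass cache admission. The class name `ScanResistantCache` and the `bulk` flag are assumptions made for the example.

```python
from collections import OrderedDict
from typing import Callable, List


class ScanResistantCache:
    """LRU cache where bulk reads (scans, backups) never evict cached blocks."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._blocks: "OrderedDict[str, str]" = OrderedDict()  # oldest first

    def read(self, block_id: str, load_from_disk: Callable[[str], str],
             bulk: bool = False) -> str:
        if block_id in self._blocks:
            if not bulk:
                # Only ordinary reads refresh recency.
                self._blocks.move_to_end(block_id)
            return self._blocks[block_id]
        data = load_from_disk(block_id)
        if not bulk:
            # Ordinary miss: admit the block, evicting the least recently used.
            self._blocks[block_id] = data
            if len(self._blocks) > self.capacity:
                self._blocks.popitem(last=False)
        # Bulk miss: serve the data but leave the cache untouched.
        return data

    def cached_ids(self) -> List[str]:
        return list(self._blocks)


def fake_disk(block_id: str) -> str:
    """Hypothetical stand-in for a disk read."""
    return f"data-{block_id}"


cache = ScanResistantCache(capacity=2)
cache.read("hot-1", fake_disk)
cache.read("hot-2", fake_disk)

# A 100-block table scan passes through without displacing the hot blocks.
for i in range(100):
    cache.read(f"scan-{i}", fake_disk, bulk=True)

after_scan = cache.cached_ids()
```

With plain LRU the scan would have pushed both hot blocks out; here they survive, and only a later ordinary read can evict them.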
Highlights specifically in the area of parallelization included:
Juan stressed that all databases consolidated onto an Exadata machine are/should be striped across all storage units.
On the other hand, Juan said that different databases should be confined to specific cores or CPUs on the database tier.
But on the third hand, Juan also stressed -- in what could be called a "private cloud" pitch -- that there's great elasticity as to which databases are matched to which server CPUs.
However, Juan says that what I regard(ed) as a major objection to Oracle's database-tier parallelization -- the need to manually specify "degrees of parallelism" -- has now been obviated by automation. Juan thinks that few data warehouse DBAs will now need to manually tune parallelism, with minor exceptions. One exception he cites is that if a nightly report really is non-urgent, it can just be forced to run on a single core with no chance to grab more resources. (However, Juan thinks manual tuning of parallelism will continue to play a greater role in OLTP.)
OK. That's all I can get done tonight (see above re: inconvenience of timing). Follow-on subjects I'd like to and indeed plan to post about include:
What Juan said about hybrid columnar compression
Oracle's delightfully non-confidential slide deck, and a few comments about same.