PureData System for transactional, analytic, and operational deployments takes on Oracle, replaces IBM's Netezza appliances and Smart Analytic system.
IBM announced on Tuesday PureData System, an expansion of the vendor's PureSystems family of integrated systems combining hardware and software. The move comes just days after Oracle introduced the latest releases in its "Exa" engineered systems lineup, timing that is clearly no coincidence.
The PureSystems family was introduced in April with PureFlex and PureApplication offerings designed to provide ready-to-run computing infrastructure (including compute power, storage, networking, and systems management), and a ready-to-run, middleware-inclusive application platform, respectively. The PureData System announced on Tuesday adds three platforms designed for data-intensive transactional, analytic, and operational analytic workloads.
The obvious competitive comparison to PureData System is Oracle Exadata, which is a database appliance that can be configured for transactional or analytic use. Where Oracle touts its engineered systems as offering "hardware and software engineered to work better together," IBM touts PureSystems as offering "expert-integrated" hardware and software based on pattern-based deployment. What, exactly, does that mean?
"The expertise is built into the system in terms of how they're configured, assembled, and integrated at one level, but we also define pattern-based deployments of the software stack and have systems in place to simplify the management of that software," said Bernard Spang, IBM's director of strategy and marketing for database software and systems, in an interview with InformationWeek. By relying on known deployment patterns, system setup times can be cut from days to minutes, Spang claimed.
Fast deployment is also promised by Oracle with its various Exa products, so it's best to check with reference customers who have completed similar deployments. The PureData System lineup will clearly go head to head with Exadata first, and with Microsoft appliances and analytic alternatives from the likes of Teradata, EMC (Greenplum), and HP (Vertica) second.
IBM's new system includes three distinct offerings: PureData System for Transactions, PureData System for Analytics, and PureData System for Operational Analytics. Each offering includes entry-level configurations that can be scaled up on IBM servers running industry-standard Intel x86 chips. The system for Transactions is also available based on IBM Power servers running AIX, IBM's flavor of the Unix operating system.
As the name suggests, the PureData System for Transactions is optimized for online transactional processing (OLTP) applications. IBM has mostly acted as an integrator of such systems, rather than acquiring enterprise applications, so it's working with third-party vendors including SAP and Infor. IBM says more than 200 solutions and applications are part of a PureSystems partner program and optimized to run on the platforms.
PureData System for Analytics is the next-generation upgrade and replacement of the IBM Netezza analytic appliance lineup. PureData System for Analytics runs on the Netezza database and expands on in-database processing options to deliver more than 200 options for advanced algorithmic processing, data prep, scoring, modeling, and geospatial analysis inside the database. It's the largest library of in-database analytic functions on the market, according to IBM, including third-party options from SAS (for scoring), Fuzzy Logix, Revolution Analytics (for R-based analytics), Zementis (for PMML), and ESRI (for geospatial analysis).
PureData System for Analytics is said to deliver 20 times higher throughput and concurrency than the previous generation of Netezza appliances, which translates to fast performance while handling many more simultaneous queries, according to Phil Francisco, IBM's VP of big data product management.
PureData System for Operational Analytics is the next-generation replacement of IBM's Smart Analytic System. Built on IBM's DB2 database and InfoSphere Warehouse software, this is IBM's enterprise data warehouse or operational data store offering designed to support thousands (or even tens of thousands) of users issuing thousands of concurrent queries.
The Operational Analytics platform delivers real-time performance by way of a continuous data-ingest feature. This parallel data-loading feature brings the latency of analysis down to seconds or even sub-seconds, according to IBM. This real-time performance is required for applications such as fraud detection during credit card processing, cross-selling or up-selling during call-center contacts, or tracking of changes in supply chain or utility demand as they're happening. Other advances over the previous-generation offering include adaptive compression and automated workload management features that reduce administrative burdens, according to IBM.
Where Exadata is one appliance that can be configured for all three scenarios described above, IBM says its separate, preconfigured offerings can be deployed and ready to run within as little as 24 hours. But one feature lacking from IBM's analytic offerings is hybrid columnar- and row-based storage and compression. Variations on this hybrid feature have been introduced in recent years by EMC Greenplum, HP Vertica, and Teradata. Oracle followed suit last week, announcing that the 12c database set for release early next year will feature hybrid columnar and row-based storage and compression.
IBM says its alternative approach to compression delivers competitive performance that reduces storage requirements and, with it, data center floor space demands. "Our adaptive approach applies compression to each individual column within the database, automatically applying one of five techniques based on the character of the data for maximum compression," Francisco explained.
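IBM did not name the five techniques, so the following is only a rough sketch of the general idea of choosing a compression scheme column by column from the character of the data. The encoding names (`run_length`, `delta`, `dictionary`, `raw`), the thresholds, and the `choose_encoding` and `plan_table` helpers are all hypothetical illustrations, not IBM's implementation.

```python
from itertools import groupby


def choose_encoding(values):
    """Pick an illustrative encoding from simple statistics of one column.

    The four encodings and the thresholds are assumptions made for this
    sketch; they are not the techniques used by any IBM product.
    """
    if not values:
        return "raw"
    runs = sum(1 for _ in groupby(values))   # number of runs of equal neighbours
    distinct = len(set(values))              # number of distinct values
    if len(values) / runs >= 2:
        return "run_length"                  # long runs of repeated values
    if all(isinstance(v, int) for v in values) and values == sorted(values):
        return "delta"                       # non-decreasing integers, e.g. IDs
    if distinct / len(values) <= 0.5:
        return "dictionary"                  # few distinct values, no long runs
    return "raw"                             # nothing obvious to exploit


def plan_table(columns):
    """Choose an encoding independently for every column of a table."""
    return {name: choose_encoding(vals) for name, vals in columns.items()}


if __name__ == "__main__":
    sample = {
        "status": ["open", "open", "open", "closed", "closed", "closed"],
        "order_id": [1001, 1002, 1003, 1004, 1005, 1006],
        "region": ["east", "west", "east", "west", "east", "west"],
        "note": ["a", "b", "c", "d", "e", "f"],
    }
    for column, encoding in plan_table(sample).items():
        print(column, "->", encoding)
```

The point of the sketch is that each column is inspected on its own, so a table can mix schemes rather than applying one compression method to every row.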
IBM also made no mention in its PureData announcement of database virtualization features, as promised by Oracle, or of in-memory analysis options as promised in Oracle 12c and already available from SAP with its Hana database.
The new PureData Systems for Analytics and Operational Analytics include connectors to Hadoop and big data environments including Cloudera Hadoop clusters and IBM BigInsights (Hadoop) and IBM Streams (real-time processing) deployments. IBM runs its big data software on suggested configurations of hardware rather than offering Hadoop-ready appliances as do competitors Oracle, EMC, and, soon, HP. And unlike Teradata, IBM has not yet announced support for HCatalog, a development that promises access to Hadoop data without actually moving the data out of a Hadoop cluster.
For customers who aren't running Oracle applications and who aren't stuck on Oracle database as the basis for analytics or data warehousing, the IBM PureData System delivers a comprehensive set of optimized data platforms. IBM is counting on faster deployment and better compatibility with third-party software as key differentiators versus the competition.
IBM PureData System will ship at the end of October, at which time IBM said it will detail product pricing. For now the pricing guideline is that the PureData System for Analytics will start at less than $500,000.