IBM 'System S' Promises New Era of Stream Processing
'Perpetual analytics' touted as the dynamic, real-time future of forward-looking analysis.
High-end analytics offer the power to predict, but those predictions may be based on warehouse-resident data that is hours, days or even weeks old. Although complex event processing technologies eliminate the data latency problem, they're most often deployed in very limited, industry-specific applications. Addressing these shortcomings and hoping to usher in a new era of real-time stream computing, IBM today unveiled System S, a new platform designed to handle instantaneous analysis of hundreds or even thousands of high-volume data streams.
"We started from scratch and looked at the mathematics of the analytics, the programming language and the way in which applications are structured," says Nagui Halim, the chief scientist behind System S. "The difference with System S is that the analytics are much more advanced and the applications are much more sophisticated in terms of what you can look at and how you express the programs."
To be marketed under the product name InfoSphere Streams, System S is designed to support forward-looking "perpetual analytics" based on analysis of up to 6 gigabytes per second or 21,600 gigabytes per hour -- the equivalent of all the Web pages on the Internet. What's more, these analyses are continuously refined and dynamically react as data sources and underlying trends change.
"As the applications are processing, they can change how they operate," Halim explains. "For example, as streams of data appear or disappear, we can introduce compensating actions and do [data] source selection on the fly. You can also send feedback to earlier parts of an application so it can dynamically tune how it processes the data."
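The behavior Halim describes can be illustrated with a minimal Python sketch. It is not Spade and not the System S API; the class names, the threshold rule and the `max_batch` capacity are all hypothetical, chosen only to show source selection on the fly and a feedback path from a later stage to an earlier one.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class AdaptiveFilter:
    """Upstream stage (hypothetical): forwards readings at or above a tunable threshold."""
    threshold: float = 0.0

    def process(self, reading: float) -> bool:
        return reading >= self.threshold


@dataclass
class Pipeline:
    """Hypothetical two-stage pipeline with a feedback path to the upstream filter."""
    max_batch: int = 2  # assumed downstream capacity per tick
    upstream: AdaptiveFilter = field(default_factory=AdaptiveFilter)

    def tick(self, sources: Dict[str, float]) -> List[Tuple[str, float]]:
        # Source selection on the fly: use whichever sources are present this tick.
        passed = [(name, value) for name, value in sorted(sources.items())
                  if self.upstream.process(value)]
        # Feedback: when downstream is overloaded, make the upstream filter stricter;
        # when there is spare capacity, relax it again.
        if len(passed) > self.max_batch:
            self.upstream.threshold += 1.0
        elif len(passed) < self.max_batch and self.upstream.threshold > 0.0:
            self.upstream.threshold -= 1.0
        return passed[: self.max_batch]


def run(ticks: Iterable[Dict[str, float]]) -> List[List[Tuple[str, float]]]:
    """Feed a sequence of per-tick source readings through one pipeline."""
    pipeline = Pipeline()
    return [pipeline.tick(sources) for sources in ticks]
```

Each call to `tick` receives whatever sources happen to be live at that moment, so a stream that vanishes simply stops appearing in the dictionary, and the threshold adjustment stands in for the "feedback to earlier parts of an application" mentioned above.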
In development at IBM Research since 2003, System S is said to be both scalable -- from laptops to exotic supercomputers -- and broadly applicable to industries such as manufacturing, retail, transportation, finance, and security and surveillance. The sweet spot for deployments will be on commodity clustered servers in the 10- to 50-blade range. The immediate focus will be on many of the same applications targeted by CEP vendors, including trade surveillance, fraud detection, market making and program trading applications at financial institutions.
Among the early beta customers of System S is TD Securities, which is said to be using the technology to ingest more than 5 million bits of trading data per microsecond to make faster financial trading decisions. Uppsala University and the Swedish Institute of Space Physics, meanwhile, are using System S to predict “space weather” such as solar winds that can have an impact on communications, energy transmission over power lines, airline and space travel, and satellites.
System S frees mathematicians to employ sophisticated analytic techniques such as micro clustering or support-vector machine analysis without proprietary restrictions, Halim says. And to allay fears that an entirely new platform might discourage would-be developers, IBM is making System S trial code available at no cost, and it will also offer developer tools, adapters and software for testing applications. The new development language, called Spade, is easy to pick up quickly, Halim says.
"We've worked with clients to help them learn the language, and we've found that within two to four weeks they can become productive," Halim says. "We haven't found it to be a big barrier to entry because it employs familiar ways to express how the information is handled."
IBM also announced today that it will open an IBM European Stream Computing Center in Dublin, Ireland, to provide customer support, testing and research capabilities for prospective European customers.