12/18/2013
10:34 AM

Amazon Kinesis: Fast Analytics On Streaming Data

AWS Kinesis service takes in thousands of data streams, processes them on an Amazon cluster, and offers results in near real time.

Kinesis, Amazon Web Services' new service for processing a high volume of real-time data, such as that pouring off a stock ticker, is open for business. The system was announced, but not made generally available, Nov. 14 during AWS's Re:Invent event in Las Vegas.

A customer can start out feeding kilobytes of data into Kinesis and move up to terabytes over the course of an hour, depending on the demands of the real-time data stream. Streams from hundreds or thousands of sources, such as social media, investment research services, or news services, can be added to an original stream, allowing Kinesis to show correlations between real-time events.

Breaking news items, such as a report that the anchovy harvest has failed off the coast of Chile, can have a big impact on trading at an exchange like the Chicago Board of Trade. Likewise, companies could track Twitter, Facebook, and Google+ traffic following business announcements, such as the close of a favorable quarter or a product line addition.
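To make the idea of combining feeds concrete, here is a minimal sketch in Python of merging two time-ordered streams and pairing up events from different sources that occur close together. It is not Kinesis code. The event tuples, the `merge_streams` and `co_occurring` helpers, and the sample data are all invented for illustration, and the sketch assumes each feed is already sorted by timestamp.

```python
import heapq


def merge_streams(*streams):
    """Merge time-sorted streams of (timestamp, source, text) into one stream."""
    return heapq.merge(*streams, key=lambda event: event[0])


def co_occurring(events, window_seconds):
    """Yield pairs of events from different sources within window_seconds."""
    recent = []
    for event in events:
        ts, source, _ = event
        # Keep only events that are still inside the time window.
        recent = [e for e in recent if ts - e[0] <= window_seconds]
        for earlier in recent:
            if earlier[1] != source:
                yield earlier, event
        recent.append(event)


# Invented sample feeds (timestamps in seconds), each sorted by time.
news = [
    (100, "news", "anchovy harvest fails off Chile"),
    (400, "news", "quarterly results announced"),
]
trades = [
    (90, "trades", "fishmeal futures 50"),
    (130, "trades", "fishmeal futures 58"),
]

pairs = list(co_occurring(merge_streams(news, trades), window_seconds=60))
for earlier, later in pairs:
    print(earlier[2], "<->", later[2])
```

A time window is the simplest possible notion of correlation. A real application would likely apply domain-specific rules on top of it.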

Kinesis is available through AWS's US East-1 complex in Ashburn, Va., but will be rolled out to other Amazon regional data centers in 2014.

Applications built to use Kinesis can produce near real-time dashboards, alerts, and reports that can drive real-time business decision making, such as whether to change pricing on a hot-selling product or whether to adjust an advertising strategy, according to Terry Hanold, VP, AWS cloud commerce.

Kinesis applications could collect data from server logs in real time and analyze what's happening on a website during a busy holiday shopping period, or collect data on dozens or hundreds of devices on the factory floor to spot where the next delay might occur.
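As a rough illustration of the log-analysis case, the following Python sketch keeps a sliding 60-second count of server errors per URL path. It is a stand-alone example rather than anything supplied by AWS. The log format, the `SlidingWindowCounter` class, and the sample lines are assumptions made for the example, and the code assumes log lines arrive in timestamp order.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Count events per key over the last `window_seconds` seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, key) pairs, oldest first
        self.counts = Counter()

    def add(self, timestamp, key):
        self.events.append((timestamp, key))
        self.counts[key] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_key = self.events.popleft()
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]

    def top(self, n=3):
        return self.counts.most_common(n)


def parse_log_line(line):
    # Hypothetical log format: "<epoch_seconds> <http_status> <path>"
    ts, status, path = line.split(maxsplit=2)
    return int(ts), int(status), path


# Invented sample log lines.
log_lines = [
    "1000 200 /checkout",
    "1010 500 /checkout",
    "1020 200 /cart",
    "1050 500 /checkout",
    "1075 500 /cart",
]

errors = SlidingWindowCounter(window_seconds=60)
for line in log_lines:
    ts, status, path = parse_log_line(line)
    if status >= 500:
        errors.add(ts, path)

print(errors.top())
```

A dashboard or alert could poll `top()` periodically and flag any path whose error count crosses a threshold.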

One reason to do data-stream analysis in the cloud is that such a service can elastically expand to meet the data streams' demands. Hanold said in the announcement that customers can capture data streams with a few clicks on the Amazon management console or by programming an application with a simple API call.
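For a sense of what a single API call might carry, the Python sketch below builds a request body modeled on the Kinesis PutRecord operation, which takes a stream name, a partition key, and a base64-encoded data blob. It only constructs the JSON and does not sign or send anything; in practice an AWS SDK would handle that. The stream name `ticker-stream`, the helper `build_put_record_body`, and the sample quote are hypothetical.

```python
import base64
import json


def build_put_record_body(stream_name, partition_key, payload):
    """Return a JSON body shaped like a PutRecord request (not sent anywhere)."""
    data = json.dumps(payload, sort_keys=True).encode("utf-8")
    return json.dumps({
        "StreamName": stream_name,
        "PartitionKey": partition_key,
        # Binary data travels as base64 text inside the JSON body.
        "Data": base64.b64encode(data).decode("ascii"),
    })


# Hypothetical stream name and invented stock quote.
body = build_put_record_body(
    "ticker-stream", "ACME", {"symbol": "ACME", "price": 101.25}
)
decoded = json.loads(body)
print(decoded["StreamName"], decoded["PartitionKey"])
```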

Enterprise developers often build such systems themselves, using open source Hadoop or other resources. But Hadoop 1.0 and data warehouses tend to need time to load data, analyze it in batch mode, and report on the results. Real-time data feeds have not been a good fit, although Hadoop 2.0 may change that.

Kinesis can absorb data feeds, perform analysis on them, and then route them to Amazon's Redshift data warehouse service, DynamoDB database system, or S3 object storage. It can use load balancing and elastic scaling to create clusters to host the data streams fed into it. It can also work with Amazon CloudWatch to supply throughput, latency, and utilization statistics back to the management console.

Artist's rendering of India's Mars mission. (Source: Wikimedia Commons)

Khawaja Shams, a scientist at the NASA Jet Propulsion Laboratory, took the stage at Re:Invent Nov. 14 to say he had tested Kinesis by plugging in a Twitter data stream and asking Kinesis to determine how often the word "Mars" was used. Shams hoped to measure the popularity of space exploration after India launched a mission to Mars. But he learned that the "Mars" that appeared most frequently in tweets was Bruno Mars, the singer, not the planet. Following up, he was able to learn that the largest concentration of the singer's fans is on the West Coast. It wasn't the information he originally sought, but he had discovered what Kinesis could do and found ways to query it.
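The kind of query Shams described can be approximated in a few lines of Python. The sketch below tallies messages that mention "Mars" and separates those where the word follows "Bruno" from the rest. It is not his actual code or a Kinesis API. The sample messages are made up, and the one-word heuristic in the hypothetical `classify_mars_mention` helper is far cruder than a real analysis would need.

```python
import re
from collections import Counter

WORD = re.compile(r"[a-z0-9']+")


def classify_mars_mention(text):
    """Return 'singer', 'other', or None if the text never mentions Mars."""
    words = WORD.findall(text.lower())
    if "mars" not in words:
        return None
    i = words.index("mars")  # only the first mention is examined
    if i > 0 and words[i - 1] == "bruno":
        return "singer"
    return "other"


# Invented sample messages standing in for a live stream.
tweets = [
    "Bruno Mars tickets on sale now",
    "India launches its mission to Mars",
    "Can't stop listening to Bruno Mars",
    "Nice weather today",
]

tally = Counter(
    label for label in map(classify_mars_mention, tweets) if label is not None
)
print(tally)
```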

Amazon hopes Kinesis will become a way for developers to add real-time analytics to their applications, letting Kinesis and EC2 scale the system as needed. With such a service, a developer could collect and analyze very large amounts of data without needing to know much more than an API call. "It does the heavy lifting so you don't have to," said Shams.

Charles Babcock is an editor-at-large for InformationWeek, having joined the publication in 2003. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week.

Comments
danielcawrey,
User Rank: Ninja
12/18/2013 | 8:17:35 PM
Re: Amazon Kinesis
I'm sure that Amazon has already tested out the Kinesis functionality quite deeply. And I can see how there would be a number of applications for this. 

Social media, ecommerce and real-time communications are ripe for this type of tech. And what's the point of building out a proprietary system when Amazon's already got it covered?
Laurianne,
User Rank: Author
12/18/2013 | 1:26:06 PM
Amazon Kinesis
This makes me wonder how Amazon itself is using this powerful analytics capability. And think of the possibilities for airlines. Maybe not such good news for travelers seeking low prices?