Cloud // Software as a Service
News
9/17/2013
05:11 PM

Google BigQuery Adds Data Streaming

Developers and businesses can now analyze data as it gets entered into BigQuery.

Google's BigQuery, a Web service for analyzing large amounts of data, is about to become more efficient, gain the ability to query data subsets, and get a refreshed interface.

On Wednesday, Google plans to introduce several new features: support for real-time data streaming in BigQuery, the ability to query portions of a table, the query functions SUM and COUNT, and interface improvements designed to enhance productivity.

BigQuery was launched last year as a tool for interactive data analysis. Unlike Google Cloud SQL, it's not a database. Rather, it brings MySQL-style querying to a NoSQL datastore.

With additional tools, Hadoop clusters can be deployed to query multi-terabyte datasets, but the resulting system probably won't return query results as rapidly.

Raj Pai, CEO of social analytics company Claritics, said in a Google case study that time-consuming complex queries of large data sets on Hadoop clusters can be processed by BigQuery in as little as 20 seconds. As a consequence, his company has been able to develop apps four times faster and to spend about 40% less time focused on IT infrastructure.

Similar offerings from other companies include Amazon Elastic MapReduce, IBM BigInsights and Microsoft Azure HDInsight.

While BigQuery was designed to perform SQL-like queries on large datasets quickly, speed can still be an issue: It can take a long time to move large amounts of data into the cloud.

That's where real-time data streaming comes in. Developers and businesses can now stream data row by row using a new API call. This allows data processing to begin immediately, rather than waiting for the data to be uploaded to a cache for batch processing.
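The article does not name the new API call. As a rough sketch, assuming it corresponds to the BigQuery REST method `tabledata.insertAll`, a row-by-row streaming request could be assembled as follows. The helper names `build_insert_all_body` and `insert_all_url` are hypothetical, and the example only builds the request; it does not authenticate or send anything.

```python
import json
import uuid


def build_insert_all_body(rows):
    """Wrap plain dict rows in the body shape assumed for tabledata.insertAll.

    Each row gets a unique insertId, which the service can use to
    de-duplicate retried inserts on a best-effort basis.
    """
    return {
        "kind": "bigquery#tableDataInsertAllRequest",
        "rows": [{"insertId": str(uuid.uuid4()), "json": row} for row in rows],
    }


def insert_all_url(project_id, dataset_id, table_id):
    """Build the REST endpoint assumed for streaming rows into one table."""
    return (
        "https://www.googleapis.com/bigquery/v2/"
        f"projects/{project_id}/datasets/{dataset_id}/tables/{table_id}/insertAll"
    )


if __name__ == "__main__":
    # Hypothetical project, dataset, table, and row values for illustration.
    body = build_insert_all_body([{"user_id": "u123", "event": "login"}])
    print(insert_all_url("my-project", "analytics", "events"))
    print(json.dumps(body, indent=2))
```

In practice the body would be sent as an authenticated HTTP POST, typically through one of Google's client libraries rather than by hand.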

Google is offering streaming ingestion for free until Jan. 1, 2014. Thereafter, streamed data will be billed at a rate of $0.01 per 10,000 rows inserted. Batch-based data ingestion will remain free.
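To put the quoted rate in perspective, here is a small estimate of streaming charges after the free period. It assumes billing is strictly proportional to the number of rows, with no rounding up to whole 10,000-row blocks, which the article does not specify. The function name `streaming_cost` is illustrative.

```python
from decimal import Decimal

# Rate quoted in the article: $0.01 per 10,000 rows inserted.
PRICE_PER_BLOCK = Decimal("0.01")
ROWS_PER_BLOCK = 10_000


def streaming_cost(rows_inserted):
    """Estimate the streaming charge in dollars for a given row count.

    Assumes pro-rata billing; the real billing granularity may differ.
    """
    if rows_inserted < 0:
        raise ValueError("rows_inserted must be non-negative")
    return PRICE_PER_BLOCK * Decimal(rows_inserted) / ROWS_PER_BLOCK


if __name__ == "__main__":
    for rows in (10_000, 1_000_000, 100_000_000):
        print(rows, "rows ->", streaming_cost(rows), "dollars")
```

Under that assumption, one million streamed rows would cost one dollar.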

Subsets of a table can now be queried by adding a "table decorator" to an SQL statement. Decorators are limited to data inserted within the last 24 hours.

Beyond the cost benefits of concise queries, Google product manager Ju-kay Kwek said in a blog post that table decorators can be used in conjunction with real-time data streaming to do things like monitor user activity during a recent time period, such as the introduction of a Web app update.
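The article does not show what a table decorator looks like. The sketch below assumes BigQuery's bracketed table syntax with a relative range decorator, where `@-3600000-` means "rows added from 3,600,000 milliseconds ago until now." The dataset and table names and both helper functions are hypothetical.

```python
MAX_WINDOW_MINUTES = 24 * 60  # the article says decorators cover the last 24 hours


def recent_rows_table(dataset, table, minutes):
    """Return a table reference limited to rows added in the last `minutes`.

    Assumes the range-decorator form [dataset.table@-<milliseconds>-].
    """
    if not 0 < minutes <= MAX_WINDOW_MINUTES:
        raise ValueError("window must be within the last 24 hours")
    millis = minutes * 60 * 1000
    return f"[{dataset}.{table}@-{millis}-]"


def count_recent_query(dataset, table, minutes):
    """Build a query counting only the recently added rows."""
    return f"SELECT COUNT(*) FROM {recent_rows_table(dataset, table, minutes)}"


if __name__ == "__main__":
    # For example, count events streamed in during the hour after an app update.
    print(count_recent_query("logs", "events", 60))
```

Because the query scans only the decorated slice of the table rather than the whole table, it should also process less data, which is the cost benefit the article mentions.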

The new SUM and COUNT functions expand BigQuery's statistical capabilities. And BigQuery interaction has been enhanced with an expanding information panel that provides more detail about queries and with action buttons at the bottom of the query box.

Kwek said in an email that Google sees BigQuery being used by a wide range of industries, including e-commerce, retail, logistics and operations. Service partners such as PA Consulting and Saama Technologies have also helped companies in specialized industries like healthcare implement BigQuery.

Kwek said BigQuery and Amazon Elastic MapReduce (EMR) serve different functions. "BigQuery is well suited for businesses who need to analyze large amounts of data in an ad hoc and iterative manner, who can't or don't want to build and manage a lot of technical infrastructure," he said. "EMR is very different; as a general purpose framework for running MapReduce jobs it's powerful and flexible, but requires significant investments in infrastructure and management."

Google doesn't disclose user figures for specific services, but the company says it has 3 million active applications running on its Cloud Platform and about 300,000 unique developers using its services every month.

Comments
D. Henschen,
User Rank: Author
9/19/2013 | 1:07:50 AM
re: Google BigQuery Adds Data Streaming
I'm not clear on how/why a cloud-based service like Amazon Elastic MapReduce "requires significant investments in infrastructure and management." Isn't the whole idea of a cloud service NOT requiring infrastructure? If Google is talking about analytical tools to run on top of Hadoop, there are options on AWS including Karmasphere, just as Google has BI/analytics partners on BigQuery. Is there really zero infrastructure or management behind Google's offering?