Google Compute Engine Leverages Third-Party Support
RightScale, MapR, and Puppet Labs bring key features, a larger ecosystem to Google's new infrastructure-as-a-service.
For example, Google's IaaS is designed to run KVM workloads, which works fine for the many startups and independent developers who have built their applications on Google App Engine and are not running them in any virtual machine. They can submit their job to Compute Engine, select a type of server, and let it provision a KVM virtual machine for them.
But many established businesses are already running virtualized applications and are ready to move a discrete workload, configured the way they want, to public IaaS. Amazon Web Services' EC2, for example, prefers tasks submitted in its Amazon Machine Image format, but accepts VMware virtual machines and converts them. Microsoft's Windows Azure is designed to run Hyper-V, but can accept both VMware and Citrix XenServer jobs.
For Google Compute Engine to do the same, it will need the help of a third party, RightScale, which provides a job configuration front end that can translate between the different virtual machine formats. RightScale has produced over 40,000 Linux and Windows server templates that a customer can browse. Once the customer selects one, RightScale sets the operating system and application combination that can be submitted to a particular cloud service, including Amazon EC2 and, now, Google's Compute Engine.
Michael Crandell, CEO of RightScale, said his firm looked over the features of Google's pending IaaS and was impressed with its speed in booting up servers and its ability to establish encrypted, private-line communications between virtual machines. The virtual machines may be in different geographic locations, but the connections between them "appear as a local area network, from the system administrator's point of view," he said in an interview.
In addition, data written to storage by Compute Engine is automatically encrypted, a security feature that Crandell thinks will be attractive to future cloud users. So RightScale signed up to support Google's KVM infrastructure, and customers who might otherwise be turned off by the use of KVM may go through RightScale--for a fee--to have their workloads targeted to Compute Engine.
Likewise, performing analytics on big data is one of the cloud's attractions. Google, as the inventor of Bigtable and MapReduce, should be able to attract big data users in the long run. But it helps that MapR, an implementer of analytics on open source Hadoop, has a system ready for use on Compute Engine.
MapR is a commercial implementation of the Apache Software Foundation's Hadoop. At Google I/O, a 1-TB sort, or TeraSort, job was completed in 80 seconds on a Compute Engine cluster of 1,256 nodes and 1,256 disks, at a cost of $16.
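To put those TeraSort figures in perspective, the short Python sketch below derives the aggregate throughput, the per-node throughput, and the implied price per node-hour. Only the four input numbers come from the demo as reported here; the function name, the use of decimal terabytes, and the assumption that cost scales linearly with node-seconds are illustrative choices, not anything Google or MapR has stated about how the run was billed.

```python
# Figures reported for the Google I/O TeraSort demo.
DATA_BYTES = 10**12        # 1 TB, assuming a decimal terabyte
ELAPSED_SECONDS = 80
NODES = 1256
TOTAL_COST_USD = 16.0


def terasort_summary(data_bytes, seconds, nodes, cost_usd):
    """Derive throughput and implied pricing from the reported figures.

    Assumption: cost is spread evenly over node-seconds actually used.
    Real billing granularity may differ.
    """
    node_hours = nodes * seconds / 3600
    bytes_per_second = data_bytes / seconds
    return {
        "aggregate_gb_per_s": bytes_per_second / 1e9,
        "per_node_mb_per_s": bytes_per_second / nodes / 1e6,
        "node_hours": node_hours,
        "implied_usd_per_node_hour": cost_usd / node_hours,
    }


summary = terasort_summary(DATA_BYTES, ELAPSED_SECONDS, NODES, TOTAL_COST_USD)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the cluster as a whole sorted about 12.5 GB per second, or roughly 10 MB per second per node, and used a little under 28 node-hours in total.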
Although he didn't tie the project to MapR or Hadoop, Google senior VP Urs Hölzle, one of the primary architects of Google's search data centers, appeared at Google I/O on June 28 to illustrate how Compute Engine can be used for Hadoop-style parallel processing of a big data problem.
In this case, the big data problem was finding characteristics and attributes in the records of cancer patients and associating them with specific genes or gene mutations known in the human genome. An algorithm used by the Institute for Systems Biology in Seattle sifts through the human genome, looking for associations between what is known about specific genes and the attributes of cancer patients.