Parallel VM Creation: Scaling Your OpenStack Cloud In Real-Time - InformationWeek


11:15 AM
Sean Cohen
Partner Perspectives

Parallel VM Creation: Scaling Your OpenStack Cloud In Real-Time

You selected OpenStack to deploy an open private cloud. Anchoring it to traditional storage is like attaching a ball and chain that robs you of its benefits. Save time and money and enable new capabilities such as instantaneous and parallel VM creation by pairing your OpenStack environment with the software-defined storage of Ceph.

“It’s all over the place…”

Time was you’d hear that phrase and think that things were disorganized, unstructured, scattered, bad. 

In the era of the cloud, that concept has been turned on its head. Now applications, workloads, metadata, and more are distributed to multiple servers in multiple data centers in multiple geographies.  In other words, “it’s all over the place!”  And that’s a GOOD thing.

Disabling Technologies

Organizations of all sizes have chosen to deploy OpenStack for a flexible cloud environment that’s built to scale out efficiently “all over the place” on commodity hardware. And while OpenStack has become the preference of many that are transitioning to cloud computing, open-source Ceph storage has become the preference of those deploying OpenStack. In fact, the October 2016 OpenStack user survey shows that Ceph has garnered nearly 60% of the OpenStack user base for block storage.

But Ceph storage is not just about block. Ceph accommodates massive scale on standard commodity hardware, provides a unified platform for block, object, and shared file system data, and integrates tightly with OpenStack’s services. It’s a far better option than traditional storage appliances built on proprietary hardware with embedded proprietary software. Pairing OpenStack with one of those appliances is like buying the ultimate driving machine and putting square tires on it. The best engine available becomes hobbled. Worse, you’re running it on gasoline that’s only available from one station, so you must drive there every time you need more fuel.

Everyday Tasks Need Infrastructure Support

Part of the value of Ceph’s tight integration with OpenStack is its ability to accommodate the everyday tasks of OpenStack users without square tires. Supporting those tasks is fundamental to Ceph, but OpenStack users often overlook the distinct possibility that their storage infrastructure isn’t up to the job.

One such task is the creation of virtual machines. OpenStack users make copies of VMs frequently -- not as insurance against losing them but to put them to work quickly, for example as templates that let them reuse their applications more efficiently. Fortunately, the very nature of Ceph’s architecture addresses this requirement because data and clones of data are automatically distributed “all over the place.”

With traditional storage (the ball and chain, the square-tired ultimate driving machine), these clones are not distributed. Instead they create a resource bottleneck that hampers their usefulness until the OpenStack developer manually copies them and places them strategically. What makes Ceph different?

CRUSH: Adding VM Scale To Scale-Out Storage

Imagine if all cable TV services were designed around the paradigm that viewers select a program, which is then downloaded to their set-top boxes and played. Every viewing of every program would require wait time at the start.

That’s not very different from the way most virtual machines are cloned and distributed across a network.  The server running the hypervisor is instructed to make a duplicate of one of the VMs contained in its storage. It makes that copy, which takes a few minutes, then it uploads that copy across the network, which takes a few more minutes. It’s bearable if it’s just one copy, but what if you need 1,000 copies?  Suddenly those minutes turn into hours, days, or even weeks.
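To make the arithmetic concrete, here is a back-of-the-envelope sketch. The article only says each step takes "a few minutes", so the three-minute figures below are assumptions chosen for illustration, and `serial_clone_hours` is a hypothetical helper name.

```python
# Assumed figures for illustration only; the article says "a few minutes" per step.
COPY_MINUTES = 3      # assumption: time to duplicate one VM image locally
UPLOAD_MINUTES = 3    # assumption: time to push that copy across the network


def serial_clone_hours(copies: int) -> float:
    """Total wall-clock hours when each copy is made and uploaded one after another."""
    return copies * (COPY_MINUTES + UPLOAD_MINUTES) / 60


for n in (1, 100, 1000):
    print(f"{n:>5} copies: {serial_clone_hours(n):.1f} hours")
```

Under these assumptions, 1,000 serial copies take about 100 hours, or a little over four days. Larger images or a busier network push that figure higher.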

Similarly, many scale-out architectures require a lookup or metadata server to allow nodes to access required data. The node first talks to the lookup server, which tells it where the data resides on the cluster. Then the node retrieves the data. This can be slow and cumbersome. Ceph instead uses a novel data placement algorithm called CRUSH -- Controlled Replication Under Scalable Hashing. Each OpenStack compute node runs CRUSH, which computes a consistent, reliable location for the required data. The node goes directly to the data, eliminating both the need for centralized controllers and lookup servers and the latency they introduce.
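The idea of computing a location instead of looking it up can be sketched in a few lines. This is not the CRUSH algorithm itself. It substitutes a much simpler technique, rendezvous (highest-random-weight) hashing, and the node names, object name, and `place` function are all hypothetical. It shows only that two clients holding the same node list reach the same answer without consulting a central server.

```python
import hashlib


def score(object_name: str, node: str) -> int:
    """Deterministic pseudo-random weight for an (object, node) pair."""
    digest = hashlib.sha256(f"{object_name}:{node}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, nodes: list[str], replicas: int = 3) -> list[str]:
    """Pick the `replicas` nodes with the highest score for this object."""
    ranked = sorted(nodes, key=lambda node: score(object_name, node), reverse=True)
    return ranked[:replicas]


nodes = [f"osd-{i}" for i in range(8)]            # hypothetical storage nodes

# Two independent clients, given the node list in different orders,
# compute the same placement with no lookup server involved.
client_a = place("vm-image-0001", nodes)
client_b = place("vm-image-0001", list(reversed(nodes)))
print(client_a)
print(client_a == client_b)
```

The real CRUSH algorithm additionally takes the cluster's hierarchy and per-device weights into account. The sketch keeps only the property the paragraph describes: placement is a pure function of the object name and the cluster description.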

The data behind each storage device is spread redundantly across the entire cluster, with replicas placed on distributed nodes to ensure reliability. If a node is removed, the cluster retrieves the lost data from other nodes and redistributes it among the remaining nodes. This provides extraordinary data storage scalability -- thousands of client hosts or KVMs accessing petabytes to exabytes of data.
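A toy model can illustrate why losing a node disturbs only part of the cluster. As a stand-in for CRUSH it uses plain rendezvous hashing, and the node names, object names, and counts are assumptions made up for the example. It compares placements before and after one node disappears.

```python
import hashlib


def place(name: str, nodes: list[str], replicas: int = 3) -> list[str]:
    """Rank nodes by a per-object hash and keep the top `replicas` (rendezvous hashing)."""
    def score(node: str) -> bytes:
        return hashlib.sha256(f"{name}:{node}".encode()).digest()
    return sorted(nodes, key=score, reverse=True)[:replicas]


nodes = [f"node-{i}" for i in range(10)]           # hypothetical cluster
objects = [f"object-{i}" for i in range(1000)]     # hypothetical stored objects
before = {obj: place(obj, nodes) for obj in objects}

failed = "node-3"                                  # the node being removed
survivors = [n for n in nodes if n != failed]
after = {obj: place(obj, survivors) for obj in objects}

# Only objects that kept a replica on the failed node need a new home.
affected = [obj for obj in objects if failed in before[obj]]
unchanged = [obj for obj in objects if before[obj] == after[obj]]
print(f"{len(affected)} objects lost a replica; {len(unchanged)} placements untouched")
```

In this model every affected object keeps its two surviving replicas and gains one new location, so the missing copy can be rebuilt from the nodes that still hold the data.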

Each one of your applications can use the object, block, or file system interfaces to the same cluster simultaneously, which means your Ceph storage system serves as a flexible foundation for all of your data storage needs. 

The copy-on-write clone feature of Ceph storage (shown above) helps OpenStack spin up thousands of virtual machines on the fly, in probably less time than it takes to brew your next cup of coffee. Clones can come from a golden image stored in Glance, a running image in Nova, or an existing block device in Cinder. In all instances, the data is cloned instantaneously in Ceph, allowing the VM to have immediate access to it. Now when you request distribution of 1,000 copies of a VM for use on 1,000 servers, the image is cloned on the cluster instantly and every copy is available right away.
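Why a copy-on-write clone is effectively instant can be seen in a minimal in-memory model. It is a sketch of the general technique, not Ceph's implementation: the `Image` class, block contents, and block size are hypothetical, and the parent image is assumed to stay read-only once it has been cloned.

```python
class Image:
    """A block image that stores only the blocks written to it directly."""

    def __init__(self, parent=None):
        self.parent = parent      # image to fall back to for unwritten blocks
        self.blocks = {}          # block index -> data written to this image

    def read(self, index: int) -> bytes:
        if index in self.blocks:
            return self.blocks[index]
        if self.parent is not None:
            return self.parent.read(index)   # shared with the parent, never copied
        return b"\x00"                        # unwritten block reads as zero

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data             # only the changed block is stored

    def clone(self) -> "Image":
        return Image(parent=self)             # no data is copied at clone time


golden = Image()
for i in range(4):
    golden.write(i, f"base-{i}".encode())

vms = [golden.clone() for _ in range(1000)]   # 1,000 clones, zero blocks copied
vms[0].write(2, b"vm0-change")                # first write diverges one block

print(vms[0].read(2))                         # the clone sees its own change
print(vms[1].read(2))                         # other clones still see the base image
print(sum(len(vm.blocks) for vm in vms))      # blocks stored across all clones
```

Creating a clone costs the same whether the golden image is small or large, because a clone starts as nothing more than a reference to its parent. Storage is consumed only as individual VMs write their own changes.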

You can also take a snapshot of any clone to keep a point-in-time copy of the state of your VMs. These snapshots can also be stored back into Glance for later booting of new VMs. Snapshot layering enables these images to be created quickly and transparently, again with no data moving across the network.

No waiting time.

No reconfiguration of a variety of servers and hypervisors.

All nodes access the cluster simultaneously.

Thousands of VMs spin up instantly and effectively.

Just as you’d pair a fine wine with haute cuisine, pair the right storage solution with OpenStack. Those whom you serve will appreciate the menu.

Sean is a seasoned product manager with more than 15 years of experience in senior engineering, global operations, and services management roles at virtualization and cloud companies. He has international experience with storage virtualization products delivery and private ...