Google and a bevy of large enterprises and cloud providers have broken the stranglehold of big enterprise storage vendors using virtualization. You can, too.
Time to face the hard truth about enterprise storage: Apart from the intelligent software embedded in the controller, that Tier 1 SAN you spent millions of dollars on is just a conglomeration of commodity components. A 15,000-rpm enterprise-class hard drive mounted on a fancy set of rails and topped with a shiny corporate logo is still just a 15,000-rpm physical disk, even if you did pay a jaw-dropping markup. Controllers are mostly based on standard appliance or x86 platforms running Windows or modified Linux variants, with custom Java-based Web interfaces for management slapped on top.
We're not saying this is a bad thing--in fact, it makes a lot of sense to use standard building blocks rather than redesign the operating system and hardware set from the ground up. And there's real value in the software stack that provides intelligence and management. But it's time to ask this question: Is there a less expensive, more flexible alternative?
"SAN storage isn't sorcery anymore," says Jake McTigue, IT manager at medical device maker Carwild and an InformationWeek contributor. "If you have disks and management software, you can make pretty much anything happen. The question you have to ask yourself is: What exactly am I paying for here?"
What indeed. Unfortunately, enterprise storage is one of the last bastions of closed design, and Tier 1 vendors are working hard to maintain the status quo. They have stellar name recognition, and they're building on a base of solid products that gave rise to the SAN revolution. EMC in particular has a very effective marketing machine, adept at impressing executives with its black boxes. Of course, it also has a track record for reliability and enterprise-class support. What no one mentions in sales presentations, however, is that shiny new features like tiering and thin provisioning have been available for years from software vendors, and that under the hood of pricey SAN gear like the Clariion line runs Windows XP Embedded or Windows Storage Server on a standard Intel platform.
5 Reasons To Adopt Storage Virtualization Now
1. No lock-in: It's easy to migrate among storage platforms with minimal IT pain and no disruption for users.
2. Standardized management: A single interface and common features let IT manage a variety of storage devices.
3. Ease of adding new functionality: Storage virtualization software eases replication, snapshots, thin provisioning, and adoption of other utilities on dissimilar hardware.
4. Accelerate and reduce load on Tier 1 SANs: Using huge amounts of commodity RAM as cache reduces the overall load on underlying SANs, often substantially, and wrings more performance out of existing hardware.
5. Flexible growth: Start small by virtualizing low-end storage for archiving, then scale to include more critical systems, introducing faster drives until you reach full deployment.
An EMC spokesman didn't dispute the hardware angle but says what his company brings to the table is well-tested integration of high-end components, advanced software capabilities, and--perhaps most important to enterprise IT--a single point of contact if problems arise. "IT organizations understand that evaluating technology based on cost alone often leads to point decisions that can be cheap up front but very costly over the technology's useful life," he says, adding that out of EMC's several thousand engineers, a very small number are dedicated to hardware while the overwhelming majority are focused on software development.
Storage vendors like EMC, Hitachi Data Systems, and Hewlett-Packard also issue qualified-equipment lists that IT organizations can use to ensure their setups are supported. "Sure, you can do whatever you want with storage," says Network Computing editor Mike Fratto. "But go off the reservation, and you won't get support until you're back in line."
It's largely that desire for ironclad support that has allowed Tier 1 vendors to preserve the profit margins inherent in tightly coupled software and hardware. But our data clearly shows that, from IT's perspective, this path is unsustainable. In our 2010 InformationWeek Analytics State of Storage survey, seven of the top 10 most pressing concerns cited by respondents contained the word "insufficient." Insufficient storage resources for critical applications and departments. Insufficient budget. Insufficient tools to manage storage. Insufficient staff and training. We're spending money on Band-Aid technologies like deduplication while shortchanging areas like security and data analytics--not to mention innovation to move the business forward.
It's past time for an intervention. We've all grown comfortable with the server virtualization model, where you buy intelligent software and run your choice of hardware. Why should storage be any different? It shouldn't, and a number of vendors are providing the means to finally sever this bond. By offering flexible, standards-based storage virtualization and management software that will run on any x86 platform, DataCore, FalconStor, StarWind, Openfiler (open source), and others extend a tantalizing promise: Pay once for intelligent software and run it on the hardware of your choice. Get the software virtualization layer in place, and the possibilities are endless. Want to use legacy Fibre Channel arrays with an iSCSI-based server farm? No problem. How about presenting direct-attached SAS (serial-attached SCSI) arrays in low-cost disk enclosures across a Fibre Channel SAN? Simple. Migrate volumes between arrays or across entire systems with zero downtime? Have at it.
Or maybe your storage controllers are a bit long in the tooth? Just migrate the software to a shiny new x86 server, and you're in business. And this is just the beginning--since the software layer sits in front of your storage devices, the entire feature set is immediately available across all back-end storage, even on platforms that lack native support for replication, thin provisioning, snapshots, and the like.
What's the catch? As with any technology shift, there are pitfalls to avoid. We'll touch on them here and go into more depth in our full report, available at informationweek.com/analytics/storageanon. The biggest issue is that, while getting all this control and flexibility in-house is great, you'll need expertise in design and management.
Who Can Play?
Storage virtualization isn't just for the Googles of the world. In fact, we'd argue that every CIO should explore the feasibility of transitioning to commodity storage. There's no need to call in the forklifts. By separating software from hardware on new purchases, you gain essentially unlimited scalability plus the ability to uniformly manage--even accelerate--existing Tier 1 gear while establishing a sustainable growth path. Many companies are using storage virtualization software in front of their Tier 1 SANs to unify management, standardize continuity of operations and disaster recovery, and accelerate performance. In this scenario, you may not be saving money, but you're gaining benefits like enabling replication on dissimilar SANs at various locations. The real cost savings come with phasing in commodity hardware--you already have the full feature set in software, so just buy inexpensive arrays and make sure everything is mirrored for high availability.
Dan Lamorena, a senior manager in Symantec's Storage and Availability Management Group, says that he talks with a lot of companies that want to go this route but view it as very hard to do--and they're right.
"Google is famous for using low-end storage and processing, but everyone glosses over the millions and millions they spend in custom monitoring, failsafes, and preventative maintenance," says Michael Healey, CEO of integrator Yeoman Technologies and an InformationWeek contributor. "It's hard work. If you want to get the true cost savings of commodity gear, you need to be willing to invest mental energy and manpower."
We submit that something in between the extremes of a Tier 1 SAN and a Google-like collection of 1U boxes is not only doable for most enterprises--it's the wave of the future.
Ready To Take A Leap?
Brian Westerman, CIO of mainframe integrator Syzygy, is building a virtualized storage system from commodity gear after failing to get what he needed from big storage vendors. Still, Westerman isn't counting the Tier 1 players out. "They'll continue to get away with that pricing model until people stop being afraid to experiment with lower-cost alternatives," he says. "Lower cost doesn't necessarily mean lower reliability, but many managers are only interested in the safe choice. There used to be a saying that 'no one ever got fired for buying IBM,' and I believe it's still true. EMC is simply following IBM's model."
There's never been a better time to consider moving out of that storage safety zone. Many IT organizations face unprecedented pressure to serve up rapidly and continuously growing data sets faster than ever before while at the same time tightening their budgets. Using commodity hardware with a separate intelligent software layer, it's possible to deploy a SAN with 2 PB of fully redundant, highly available storage for less than $2 million. For a 10- to 20-times premium over the consumer baseline cost, you can get a very capable storage infrastructure with a full feature set included.
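As a rough back-of-the-envelope check on those figures, the sketch below works out the per-terabyte cost they imply. It assumes decimal units (1 PB = 1,000 TB), treats the $2 million figure as an upper bound, and reads the 10- to 20-times premium as applying to that per-terabyte cost; the function names are ours, purely for illustration. It also assumes the 2 PB is usable capacity; if mirroring doubles the raw disk purchased, the cost per raw terabyte is lower still.

```python
def cost_per_tb(total_cost_usd: float, capacity_pb: float) -> float:
    """Return cost per terabyte, assuming 1 PB = 1,000 TB (decimal units)."""
    return total_cost_usd / (capacity_pb * 1000)


def implied_consumer_baseline(per_tb_usd: float,
                              premium_low: float,
                              premium_high: float) -> tuple[float, float]:
    """Return the (low, high) consumer cost per TB implied by a premium range."""
    # A higher premium multiple implies a lower consumer baseline, and vice versa.
    return per_tb_usd / premium_high, per_tb_usd / premium_low


# Figures from the article: 2 PB for less than $2 million, at a 10x-20x premium.
san_per_tb = cost_per_tb(2_000_000, 2)
low, high = implied_consumer_baseline(san_per_tb, 10, 20)

print(f"SAN: under ${san_per_tb:,.0f}/TB")
print(f"Implied consumer baseline: ${low:,.0f}-${high:,.0f}/TB")
```

In other words, the commodity SAN comes in at under roughly $1 per gigabyte, against an implied consumer baseline of 5 to 10 cents per gigabyte.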
Virtualization To The Rescue
Before we dive in, let's clarify the technology in question. Like "cloud computing," "storage virtualization" has become something of a catch-all term, encompassing a range of technologies, from global namespace file systems to block-level hardware abstraction. Naturally, lots of vendors are aiming at the market, even slapping the "virtual" label on existing products that are capable of separating physical disk arrays from the logical presentation of volumes.
While file- and host-based systems like the open source Gluster or Symantec's Storage Foundation have their merits, our focus here is on block-level virtualization products that can cleanly sever the link between a given volume and the underlying block storage, in the same way that a server hypervisor abstracts the virtual machine OS from its physical hardware host. We view such block-level storage virtualization as the next step in the progression toward a fully virtualized data center.
In fact, in our August InformationWeek Analytics Data Center Convergence Survey of 432 business technology professionals, 57% of respondents with data center convergence plans are either using storage virtualization or plan to do so within the year; an additional 13% cite a two-year adoption road map. On the server virtualization side, 63% already have deployed.
Why does that matter? We suspect that many of these companies are missing some of the substantial cost and management benefits of server virtualization because their back-end storage is still shackled to proprietary platforms, with all the associated heavy lifting and data migration headaches each time a SAN must be upgraded or replaced.
Think of these block-level software products as, in effect, storage hypervisors, bringing all of the considerable cost and management benefits of server virtualization to the storage back end. Let's say you have a variety of legacy SAN platforms or are struggling to integrate a mix of equipment as the result of a merger. Storage virtualization software can give you a centralized management interface that applies a consistent feature set to all storage and lets you avoid buying per-SAN license upgrades to enable such functions as replication and mirroring.
It's worth noting that many storage software vendors are understandably reluctant to step on the toes of the Tier 1 vendors by pushing the commodity hardware angle, since they often sit in front of enterprise SAN equipment and don't want to be perceived as direct competition. That said, there's no reason for enterprise CIOs to tread lightly. At a minimum, the commodity option is a credible bargaining chip when pushing for more aggressive pricing from your current SAN vendor. For those who need to stick with the standard SAN model, take a serious look at alternatives to the enterprise heavyweights. Vendors like Compellent and HP offer many cutting-edge features and may save you money.
Support Is Key
As with any mission-critical enterprise technology, robust support plays a critical role in a successful implementation of virtualized storage. The software vendors we've mentioned offer solid support options. For hardware components, we suggest sticking with reputable providers and ensuring that their support plans meet your requirements. If you're not careful, it's easy to end up with a dozen points of contact--one for hard drives, another for enclosures, a third for servers, and so on--which can quickly lead to migraines if something goes awry.
As for design and initial implementation, companies without in-house expertise in storage virtualization will want an experienced partner to provide input on configurations and best practices, as well as setups to avoid. Although hardware-independent storage virtualization is inherently flexible, just because you can do something doesn't mean you should.
And even though it would be easy to just hand this project to the storage team, don't. Include your server folks as well. The chasm between servers and storage began to shrink with server virtualization, and abstracting storage hardware continues the convergence. Features available at multiple layers in the virtualization stack will require common guidelines and coordination. For example, do you take a snapshot of a virtual server within the hypervisor, or a snapshot of the volume from the SAN software? And with thin provisioning, server admins need to be aware of the relationship between the advertised storage and the underlying physical capacity so they can help manage utilization and avoid over-allocations.
With new streamlined interfaces and the overlap of server and storage virtualization technologies, skill sets and the distinction between these teams will continue to blur. Given the movement toward data center convergence, that's a healthy trend that CIOs need to encourage.
6 Key Features To Watch
When architecting a virtualized storage array, there are availability, performance, and complexity issues to consider. We discuss those in depth in our full report. Storage virtualization software vendors also vary in how they implement features, including:
>> High availability and mirroring: As you evaluate storage virtualization software, it's critical that the high-availability (HA) code is absolutely reliable. The software should be able to transparently handle failover/failback and arbitrate access to identical copies of data, at minimum.
>> Remote replication: Insist on the flexibility to replicate to different devices and types of storage. Pay attention to the robustness of the replication transport itself. Will it support bandwidth throttling, scheduling, and encryption? How easy is it to reverse the replication direction once you're ready to fail back? Some vendors attempt to incorporate these features internally, while others rely on optimized WAN appliances. Either way, make sure you're comfortable.
>> Thin provisioning: By letting admins advertise more storage than is physically available, thin provisioning provides flexibility and reduces the need to buy extra storage in advance. It's possible to move toward a "just-in-time" model, where IT tracks actual physical storage utilization and buys additional disks only when they're needed. But don't get carried away--float too many "storage checks" to your applications in a SAN environment and you may end up with an overdrawn storage pool, and that leads to big problems for your users, since writes will be denied until more storage is available.
>> LUN pass-through: Some products are able to act as front ends to your existing storage and then pass that LUN through to application servers. If you're looking for an easy way to migrate from existing storage, this may be the best method.
>> Manual and automated tiering: Systems that support multiple tiers let organizations buy various grades of storage and allocate the appropriate type to applications based on performance requirements. We consider manual tiering a must-have; automation is a nice touch but no magic bullet.
>> Data deduplication: Deduplicating data on production storage can be tricky, given varying data types, hash collisions, and data integrity issues. Performance is a concern, so proceed with caution, and do pilot tests with your own data before signing a purchase order.
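To make the thin-provisioning trade-off described above concrete, here is a minimal sketch of an overcommitted storage pool. It is a toy model under our own assumptions, not any vendor's implementation: the `ThinPool` class, its methods, and the 80% reorder threshold are all hypothetical, and writes are tracked per pool rather than per volume to keep it short.

```python
class PoolExhausted(Exception):
    """Raised when a write needs more physical space than the pool has left."""


class ThinPool:
    """Toy model of a thin-provisioned pool (hypothetical, for illustration)."""

    def __init__(self, physical_gb: int):
        self.physical_gb = physical_gb   # disks actually installed
        self.used_gb = 0                 # space consumed by real writes
        self.advertised_gb = 0           # capacity promised to applications

    def create_volume(self, size_gb: int) -> None:
        # Creating a volume only advertises capacity; nothing is consumed yet.
        self.advertised_gb += size_gb

    def write(self, gb: int) -> None:
        # Physical space is allocated at write time; an exhausted pool denies writes.
        if self.used_gb + gb > self.physical_gb:
            raise PoolExhausted("pool overdrawn: write denied until disks are added")
        self.used_gb += gb

    @property
    def overcommit_ratio(self) -> float:
        # Values above 1.0 mean more storage is advertised than physically exists.
        return self.advertised_gb / self.physical_gb

    def needs_more_disks(self, threshold: float = 0.8) -> bool:
        # "Just-in-time" buying: flag the pool once real utilization crosses a threshold.
        return self.used_gb / self.physical_gb >= threshold


pool = ThinPool(physical_gb=1000)
pool.create_volume(800)
pool.create_volume(800)      # 1,600 GB advertised on 1,000 GB of disk
pool.write(850)              # succeeds: physical space is still available

write_denied = False
try:
    pool.write(200)          # would need 1,050 GB of physical space
except PoolExhausted:
    write_denied = True

print(pool.overcommit_ratio, pool.needs_more_disks(), write_denied)
```

The point of the sketch is that the advertised figure never fails on its own; trouble arrives only when real writes catch up with physical capacity, which is why tracking actual utilization matters more than tracking what has been promised.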
Watch The Hype
Storage virtualization technology has matured. Many, though not all, of the software offerings we've discussed are ready for even mission-critical data.
Again, think about VMware's trajectory: Server hypervisor technology was around for years, but it was slow and relegated to test labs until commodity hardware got fast enough to run even the biggest production loads without breaking a sweat. Now, relatively inexpensive, networkable "dumb" storage hardware is readily available. Fibre Channel, iSCSI, and SAS arrays without any expensive added features are ideally suited for this initiative. SAS in particular is maturing and enables massively scalable direct-attached storage setups that can eliminate expensive Fibre Channel for small companies. Standard commodity drive equivalents can match even the highest-performance enterprise gear.
As with any product category, storage software sales teams tend to promise the world and deliver less once the engineers get involved in post-sales design work. Grill your rep or ask for an engineer to participate in discussions, and if at all possible, have your own storage architects pilot a trial copy of the product. If a vendor balks at this request, caveat emptor. Focus on the stability of the platform and its ability to reliably deliver real-world performance while integrating with the gear you already own.
Staying with the current paradigm is sustainable only until the unstoppable force of data growth runs smack into the immovable constraint of funding. What happens then? Our money is on the data. To stay competitive, companies of all sizes must explore creative, cost-effective storage alternatives.
Steve McMurray is an IT consultant specializing in virtualization and storage architecture design as well as Active Directory security, Group Policy, and other Microsoft technologies. Write to us at email@example.com.