Handling double-digit data growth rates with single-digit budget increases is the lot of most CIOs, according to our third annual InformationWeek Analytics State of Enterprise Storage Survey. The amount of data we're actively managing continues to expand at around 20% per year, and we see a long tail of besieged IT staffs dealing with growth rates exceeding 50%. At these levels, most data centers will double storage capacity every two to three years. While nearly every company's data is growing, their IT budgets aren't always: 55% expect their IT spending to rise this year, 18% are cutting, and 26% expect it to be flat, our InformationWeek Analytics Outlook 2011 Survey finds.
What's standing between us and the vortex of doom? In a word, consolidation. We're talking the continued evolution of high-density magnetic media. Bigger, faster, and less expensive solid-state drives. Virtualization to ease management of aggregated storage pools that use available capacity more efficiently. Optimization technologies like data reduction, thin provisioning, and automatic tiering. Moving to a consolidated data/storage network won't hurt, either.
It seems paradoxical to say consolidation is the key to taming storage, an ever-expanding resource. But there's a clear trend toward packing more data onto fewer systems and streaming more information over unified networks, as well as merged vendors consolidating more products under one corporate umbrella.
Consider hardware. New 200-GB and 400-GB SSDs are on the horizon. They're not inexpensive, but for the right use case, they can make sense. Nearly a quarter of our survey respondents have deployed SSDs, an increase of 37% in the past year, with more than half planning to increase or initiate SSD use this year. Meanwhile, capabilities that were once features requiring special-purpose hardware appliances are being integrated within array controllers. Data center consolidation, which started with servers, has spread to storage, as IT architects leverage larger arrays, faster networks, and more sophisticated management software to apply economies of scale to storage provisioning. Storage virtualization is a growth area, according to our survey.
In 2010, we also saw a wave of industry consolidation, with large storage vendors continuing a trend established in other IT markets of ceding innovation, R&D, and product prototyping to nimble startups, then gobbling up those that demonstrate superior technology and customer acceptance. In fact, many of the blockbuster tech acquisitions in the past year were driven by holes in buyers' storage portfolios. No deal topped the intrigue of Hewlett-Packard's successful bidding war with Dell over 3Par, and the year wrapped up with EMC acquiring scale-out network-attached storage leader Isilon and Dell grabbing storage area network specialist Compellent.
But don't worry that innovation will stall. While the big are getting bigger, the overall storage market continues to expand, leaving more than enough space for another round of advances.
One worrisome area is security. Consolidate more data onto one system and fail to protect it, and you'll see the dark side of doing more with less. "Stored data is one of the most vulnerable parts of an organization," says Doug Davis, IS coordinator for Monical's Pizza, a Midwest restaurant chain with about 65 corporate and franchise locations. "Data at rest being captured in small bytes is one of the hardest things to control. As the head of IT, it is my job to be sure nobody is removing needles from any of the haystacks in my whole field."
Out With The Old ...
IT prognosticators have been predicting the decline of Fibre Channel for years, and given the cost and complexity of deploying FC SANs, it's a dream undoubtedly shared by many CIOs. Now, finally, the maturation of 10 Gigabit Ethernet and Ethernet-based storage protocols like iSCSI and Fibre Channel over Ethernet means these systems match the raw performance of even the latest-generation 8-Gb FC. And with improvements in iSCSI software stacks, new Ethernet-FCoE switching systems like HP's Virtual Connect, and multiprotocol LAN/SAN switches like Cisco's Nexus 5000, migrating to a converged LAN and SAN architecture is easier than ever. Our survey indicates we're in the early days of this changeover, but we expect momentum to build.
While FC remains the dominant SAN access technology, improved storage gear interoperability and years of iSCSI standards development and vendor wrangling to perfect the technology are paying off. Our survey shows a small but noticeable increase in iSCSI usage, with nearly two-thirds of enterprises employing it for some applications and 16% using it for more than half their storage needs.
This is significant because iSCSI is a pillar of the converged data center. As the number of virtual servers in use increases, with their reliance on shared storage for server images and application data, converging the SAN and LAN promises to lower costs, increase scalability and flexibility, and make network management easier. You're basically designing and operating a single network instead of two networks.
Lisa Moorehead, network manager for the Massachusetts Department of Public Utilities, says tightening budgets are forcing many government entities toward IT centralization and consolidation. "Systems such as e-mail, HR/personnel, and many databases have been consolidated, and that storage is largely centralized on an FCoE or iSCSI SAN," she says.
Server virtualization has been a driving force behind many recent data center upgrades, whether for new equipment, a redesigned system architecture, or new applications. And this migration of IT infrastructure provisioning from the physical world to the logical, virtual domain is affecting networks and storage.
Parceling out disk space is a big headache for most virtual server administrators. The simplest approach carves shared storage pools into fixed chunks, much like partitioning a PC hard drive. These logical volumes are reserved for single virtual machines, regardless of whether the space is used. Since the volume size must accommodate an application's maximum expected data consumption, VM administrators tend to "supersize," planning for the worst.
Thin provisioning replaces this static allocation with a "just-in-time" approach. Much as just-in-time logistics lets Wal-Mart keep warehouse inventory to a minimum, thin provisioning lets storage systems dynamically allocate capacity in response to demand. Instead of setting a fixed size for each virtualized application, thin provisioning sets a range of storage capacity, tricking the VM into thinking it has the maximum allocation while the storage system metes out only what's actually being used.
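The allocate-on-demand behavior described above can be sketched in a few lines. This is an illustrative toy, not any vendor's implementation; the class and method names are invented. The volume advertises a large virtual size, but physical blocks come out of the shared pool only when a block is first written.

```python
# Toy model of thin provisioning: a volume reports a large "virtual" size,
# but physical blocks are drawn from a shared pool only on first write.
# All names here are invented for illustration.

class StoragePool:
    def __init__(self, physical_blocks):
        self.free = physical_blocks          # blocks actually available

    def allocate(self):
        if self.free == 0:
            raise RuntimeError("pool exhausted: add physical capacity")
        self.free -= 1

class ThinVolume:
    def __init__(self, pool, virtual_blocks):
        self.pool = pool
        self.virtual_blocks = virtual_blocks # size the VM believes it has
        self.mapped = {}                     # logical block -> data

    def write(self, block_no, data):
        if not 0 <= block_no < self.virtual_blocks:
            raise IndexError("beyond advertised volume size")
        if block_no not in self.mapped:      # first touch: allocate on demand
            self.pool.allocate()
        self.mapped[block_no] = data

    def used(self):
        return len(self.mapped)              # physical blocks consumed

pool = StoragePool(physical_blocks=100)
vol = ThinVolume(pool, virtual_blocks=1000)  # 10x over-provisioned
vol.write(0, b"app data")
print(vol.used(), pool.free)                 # -> 1 99
```

The over-provisioning ratio is the whole point: the VM sees 1,000 blocks while only 100 physically exist, on the bet that most applications never touch their maximum allocation. The risk, of course, is pool exhaustion, which is why real arrays alert administrators well before `free` hits zero.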
In a virtualized environment, thin provisioning can dramatically increase storage utilization--a benefit not lost on our survey respondents. As virtualized servers become the default for enterprise applications, we expect this feature to become common in new storage systems and for adoption to become nearly ubiquitous.
Larger storage pools also have prompted adoption of virtualization technologies to simplify data management and access. Two virtualization varieties our survey tracks are storage (or block) virtualization and file virtualization.
Both techniques introduce an abstraction layer that isolates applications--whether database systems or network file shares--from the underlying physical storage. The former virtualizes multiple physical disks to create a single, logical disk, and resembles an evolutionary extension of RAID software. But unlike RAID, many block virtualization schemes can dynamically allocate capacity, growing and shrinking volume size in response to changing needs, and they can also span multiple storage systems to create extremely large volumes. Like thin provisioning, block virtualization can greatly improve the capacity utilization of large storage pools by letting administrators granularly tailor logical volume sizes to application needs, while offering the flexibility to add more capacity without disrupting the application.
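The span-and-grow behavior of block virtualization can be sketched as an address translation table: a logical volume maps its block numbers onto extents drawn from several physical disks, and it grows by grabbing another extent without disturbing anything already mapped. This is an illustrative sketch with invented names, not a real volume manager.

```python
# Toy block virtualization: a logical volume translates logical block
# numbers to (disk, offset) extents, and can span and grow across disks.
# All names are invented for illustration.

class PhysicalDisk:
    def __init__(self, name, blocks):
        self.name, self.blocks = name, blocks
        self.next_free = 0

    def take_extent(self, size):
        if self.next_free + size > self.blocks:
            raise RuntimeError(f"{self.name} is full")
        start = self.next_free
        self.next_free += size
        return (self, start)                 # extent = (disk, starting block)

class LogicalVolume:
    EXTENT = 64                              # blocks per extent

    def __init__(self):
        self.extents = []                    # ordered list of (disk, offset)

    def grow(self, disk):
        self.extents.append(disk.take_extent(self.EXTENT))

    def locate(self, logical_block):
        """Translate a logical block number to (disk name, physical block)."""
        disk, offset = self.extents[logical_block // self.EXTENT]
        return disk.name, offset + logical_block % self.EXTENT

d1, d2 = PhysicalDisk("disk1", 128), PhysicalDisk("disk2", 128)
vol = LogicalVolume()
vol.grow(d1)                # first 64 blocks live on disk1
vol.grow(d2)                # volume now spans a second physical disk
print(vol.locate(70))       # -> ('disk2', 6)
```

The application only ever sees logical block numbers, which is what allows the array to add a third extent, or relocate an existing one, behind the application's back.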
"E-mail PST files will kill your network," says Massachusetts' Moorehead. "PST files have caused network server crashes and connection outages. Virtualized servers have enabled us to reconnect staff more quickly and get the systems back online. Being able to 'move' data around to the various location sites among our domains allows for better data management, tighter disaster recovery restoration timelines, and faster system upgrades and rebuilds."
File virtualization introduces a logical layer between a file name and its network share location. Consider file virtualization the DNS of network file shares--just as Web surfers don't need to know the IP address for a particular server, with file virtualization, users don't need to know the specific network path for a given file. They merely connect to a central share point, letting the virtualization software take care of the details.
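The DNS analogy maps directly onto code: a global namespace table resolves a logical path prefix to whichever server currently holds the share, so data can be migrated without clients noticing. The sketch below is illustrative only; the server names and paths are invented.

```python
# Toy file virtualization, "DNS for file shares": clients ask for a logical
# path; a namespace table resolves it to the share's current physical home.
# Server names and paths are invented for illustration.

class GlobalNamespace:
    def __init__(self):
        self.table = {}                      # logical prefix -> physical share

    def publish(self, logical, physical):
        self.table[logical] = physical

    def resolve(self, path):
        for logical, physical in self.table.items():
            if path.startswith(logical):
                return physical + path[len(logical):]
        raise LookupError(f"no share mapped for {path}")

ns = GlobalNamespace()
ns.publish("/corp/finance", r"\\filer1\finance")
print(ns.resolve("/corp/finance/q4.xlsx"))   # -> \\filer1\finance/q4.xlsx

# Migrate the share to a new filer: clients keep the same logical path.
ns.publish("/corp/finance", r"\\filer2\finance")
print(ns.resolve("/corp/finance/q4.xlsx"))   # -> \\filer2\finance/q4.xlsx
```

The second `publish` call is the payoff: storage administrators repoint the mapping, and every client that connects through the central share point lands on the new filer automatically.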
While both virtualization approaches improve the efficiency, manageability, and convenience of large storage pools, they also work with another performance-optimizing technology, automated tiering.
The trick with automated tiering is to match storage requirements with the correct tier. As the name implies, automated tiering facilitates this task by monitoring, at the storage controller, usage patterns of individual data blocks or files and automatically placing those most frequently accessed on Tier 1 devices and shifting less-used data to progressively lower, and less expensive, tiers. Making this migration transparent to users and applications requires virtualization, either block or file, since the application itself will have no clue whether a particular data block or file might have moved since the last time it was accessed.
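The monitor-and-migrate loop described above reduces to counting accesses per block and reassigning tiers on each pass. The sketch below is a toy with an invented "hot" threshold, not any array vendor's algorithm; real controllers weigh recency, I/O size, and policy, but the skeleton is the same.

```python
# Toy automated tiering: count accesses per block, then on each migration
# pass promote hot blocks to tier 1 and demote cold ones to tier 2.
# The threshold and names are invented for illustration.

from collections import Counter

class TieringController:
    HOT = 5                                  # accesses per pass to earn tier 1

    def __init__(self):
        self.hits = Counter()
        self.tier = {}                       # block -> 1 (fast) or 2 (cheap)

    def access(self, block):
        self.hits[block] += 1
        self.tier.setdefault(block, 2)       # new data lands on the cheap tier

    def migrate(self):
        for block in self.tier:
            self.tier[block] = 1 if self.hits[block] >= self.HOT else 2
        self.hits.clear()                    # start a fresh observation window

ctl = TieringController()
for _ in range(8):
    ctl.access("db-index")                   # hot: hit repeatedly
ctl.access("old-report")                     # cold: touched once
ctl.migrate()
print(ctl.tier)                              # -> {'db-index': 1, 'old-report': 2}
```

Because applications address blocks through the virtualization layer, the controller can rewrite this tier map freely; the next read of "db-index" simply lands on faster media.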
Our survey shows that these three important optimization technologies are still in the early stages of adoption; however, virtualization is catching on, and will no doubt post bigger gains next year. Storage virtualization is used by more than a third of our survey respondents' companies, up slightly from 2009, while almost 30% use file virtualization. Furthermore, as respondents evaluate storage products, a third say they consider storage virtualization an important feature.
Lock It Up?
Even as IT organizations place security of stored data at the top of their priority lists, the percentage of survey respondents who say their companies encrypt or plan to encrypt their data at rest on backup tapes actually declined from last year.
IT has always struggled with fitting system backups into a time window, and adding encryption typically extends the time required to complete the process. As the amount of data to be backed up grows, encryption becomes a harder sell.
"Organizations in this situation need to focus on reducing the amount of data being backed up and increasing the speed in which the backups are performed," says Adam Ely, director of security for TiVo and an InformationWeek Analytics contributor. The data reduction technologies we've discussed can help. Ely also recommends taking a risk-based approach to encrypting backup tapes. If you can't protect everything, focus on high-value targets such as source code, personally identifiable information, and card-holder data.
Another area of concern: Tight IT budgets have kept some IT teams from adding managed storage capacity. If you can't afford the hardware or additional staff to make space for data, you can't store the data--at least centrally. But that doesn't mean the actual amount of company data isn't growing. Employees are likely just buying 1-TB drives for less than $100 and lashing them to their PCs, leaving much unstructured data outside IT's control.
Perhaps in response to that challenge, we saw a nearly 50% increase in the number of survey respondents planning to implement cloud-based storage services over the next year. While IT might prefer to have all data tucked away on its own servers, where that's not feasible, a central cloud storage location is preferable to a thousand individual points of possible data loss.
Storage 2011 To-Do List
Improve efficiency: Don't throw more spindles at the problem. Look for hardware with embedded features, such as deduplication and compression, so you're not storing the same data more than once. For all but the most performance-sensitive databases, seek products that can apply data reduction to Tier 1 storage.
Run the numbers: SSDs are expensive, but for transactional applications that need high storage throughput, they can be less expensive and more efficient (both in space and power) than massively parallel disk arrays.
Consider automated tiering: Manually moving data between tiers is time consuming, requiring disciplined information life-cycle management. Embedded automation software is a much better bet.
Stay out of the pool business: Create as few storage pools as possible. New arrays should be able to support multiple access protocols, both NAS (CIFS and NFS) and SAN (iSCSI and FCoE), as well as different types of storage devices (high-performance SAS, low-cost SATA, and SSD). This convergence lets IT organizations meet diverse application requirements by logically carving up capacity on one array instead of physically dedicating space on several boxes.
Don't fear the cloud: Hardware vendors are spreading FUD to keep customers locked into a lucrative (for them) business model. Sure, putting your company's financial data in the cloud is imprudent, but for archives, infrequently accessed data, or when you need higher capacity temporarily, cloud storage, wrapped with some nice information search and management software, makes sense.
Accelerate consolidation: If you haven't made the jump to an FC SAN, think twice before doing so. Although FCoE equipment interoperability isn't perfect, it's improving, and iSCSI is now a viable Ethernet SAN alternative. As you plan 10 Gigabit Ethernet LAN upgrades, think about using network virtualization to carve the same large pipe into several logical LANs and SANs.
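The deduplication idea from the first to-do item--never storing the same data twice--boils down to keying chunks by a content hash and keeping only references for repeats. This is a minimal illustrative sketch, not a production dedupe engine (which would also chunk streams and manage reference counts).

```python
# Toy content-hash deduplication: each unique chunk is stored once, keyed
# by its SHA-256 digest; duplicate writes just return the existing key.
# Names are invented for illustration.

import hashlib

class DedupStore:
    def __init__(self):
        self.chunks = {}                     # sha256 digest -> chunk bytes

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(key, data)    # store payload only if it's new
        return key                           # caller keeps this reference

    def get(self, key):
        return self.chunks[key]

store = DedupStore()
ref1 = store.put(b"quarterly backup block")
ref2 = store.put(b"quarterly backup block")  # duplicate write costs nothing
print(ref1 == ref2, len(store.chunks))       # -> True 1
```

Backup streams are full of such repeats--the same OS images, the same unchanged files week after week--which is why dedupe pays off most dramatically on backup targets and, increasingly, on Tier 1 arrays.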