InformationWeek Stories by George Crump
http://www.informationweek.com
Copyright 2012, UBM LLC.


What Is Software-Defined Storage?
Is it really a new product or simply a reframing of existing storage technology? Actually, it's a little bit of both.
2013-01-02 | http://www.informationweek.com/what-is-software-defined-storage/240145344?cid=RSSfeed_IWK_authors

With increasing frequency, I'm asked for my thoughts on the emerging software-defined storage category. Whenever I'm presented with a new tech term, I ask whether it truly defines a new product category or is simply an attempt to make an existing technology seem more glamorous. Software-defined storage is a little bit of both. If we really must have a separate term for this group of products, here is what I think the definition should be.

We have been using software to define storage for as long as there has been storage. One could take the stance that a volume-manager application is essentially software defining storage. But those promoting the current term clearly have more in mind. You could also easily lump anything that does storage virtualization into this category, and we are seeing many of the storage virtualization vendors do just that.

For me, though, there is a difference. Both storage virtualization and software-defined storage abstract the storage services from the storage system, allowing them to provide those services across a variety of disk and solid-state storage systems. Storage virtualization, however, should be reserved for products that must run on a dedicated piece of hardware. For many vendors, this is a purpose-built appliance; for others, it is software that you load on a dedicated server.

I don't think there is anything controversial about this separation thus far. However, I would refine the above to also include products that require their software to run as a dedicated virtual machine. The fact that your appliance is virtual does not mean it does not require an appliance; it simply means that it does not require hardware. It is essentially virtualized storage virtualization. That said, storage appliances running virtually can be seen as an improvement over dedicated external devices, as they bring storage performance and costs in lockstep with the scaling of the virtual infrastructure.

This means that software-defined storage is storage software that is an extension of the existing operating system or hypervisor and does not require a specific virtual machine to run in. As we discuss in "<a href="http://www.storage-switzerland.com/Articles/Entries/2011/9/26_The_Storage_Hypervisor.html">What is the Storage Hypervisor?</a>", this means that either the operating system / hypervisor provider or, via extension, a third party has added features like thin provisioning, snapshots, cloning and replication. At that point, all that is needed from the physical storage hardware is a reliable design and potentially high availability.

For the IT professional this is more than just a discussion of semantics. Each approach has its place and can bring significant value to the enterprise. As the data center becomes increasingly virtualized, software-defined storage and virtualized storage virtualization become an ideal method for scaling storage capacity and performance as the virtual environment scales.
Until that time, storage virtualization running on dedicated hardware provides the benefits of software-defined storage across both virtualized and non-virtualized platforms.


How To Choose A Data Archiving Platform
No matter what your motivation for archiving data, the storage system needs to provide data integrity, scalability and power management.
2012-12-27 | http://www.informationweek.com/storage/systems/how-to-choose-a-data-archiving-platform/240145234?cid=RSSfeed_IWK_authors

At the end of the year, the topic of data archiving heats up. My <a href="http://www.informationweek.com/storage/data-protection/find-the-right-data-archive-method/240144127">last column</a> covered different methods for moving data to that archive. In this column we will look at storage systems that want to be your repository for storing this information. In general, no matter what your motivation for archiving data, the archive storage system needs to provide data integrity, scalability and power management -- and, of course, do so at competitive pricing.

There are several types of devices that you can archive to. The first, and one that might be overlooked, is a big disk array. Although these often don't have the capabilities to do continuous data verification and might not have the large scaling capabilities that other, more archive-specific systems do, they do have one big advantage: price. These systems tend to be very cost-effective if your archive requirements won't reach the limits of a single array. A few of these systems also have very mature power-saving capabilities such as spin-down drives.

Another option outside of traditional archive storage systems is cloud storage services. Cloud has the advantage of not taking up any of your data center footprint and never running out of capacity. Some cloud providers, via third-party archive solutions, can also provide complete data integrity checking. They also, of course, have the advantage of a pay-as-you-go license, so the upfront investment is minimal. The downside to these services is that they are pay-as-you-grow as well: you keep paying and paying. Storing terabytes and terabytes of information in the cloud for decades could be very expensive over time.

[ Wondering how cloud storage differs from online? Read <a href="http://www.informationweek.com/byte/personal-tech/consumer-services/online-backup-vs-cloud-storage/240142299">Online Backup Vs. Cloud Storage</a>. ]

There is the option to build your own cloud storage system in house; in other words, a private cloud. As I recently described in my article <a href="http://www.storage-switzerland.com/Blog/Entries/2012/11/30_What_Is_Object_Storage.html">"What is Object Storage,"</a> most of these systems tend to use an object file layout. This gives them tremendous scalability and consistent performance even as the amount of archive data increases.
Leveraging an object layout also provides the foundation for doing continuous data verification.

These systems also tend to scale one node at a time, providing a similar pay-as-you-grow capability. Unlike the cloud, though, you own it. This has its pros and cons. There is also the challenge that you have to store all your data on disk. That means these systems need to be powered and running in order to operate. Few scale-out object storage systems have developed the capability to "spin down" nodes.

Finally, there is tape. Tape wins hands down for price competitiveness and for power efficiency. The above technologies all provide near-instant retrieval; tape does not. But you have to ask yourself: if a request comes in for data that is 10 years old, do you really need to recover it in seconds? Or can it wait a few minutes for the tape to be loaded into a tape drive, found and then recovered? If it can, then tape might be for you.

Another concern about tape is data integrity. As we discussed in our webinar <a href="http://www.storage-switzerland.com/BT4ReaTape.html">The Four Reasons The Data Center is Returning To Tape</a>, tape cartridges have actually been proven to be more reliable than a disk drive, but they don't have the built-in data integrity checks that some of the above methods do. However, some archiving solutions that support tape provide the ability to perform scheduled scans of tape media so that integrity can be assured.

So, which one to pick? Most vendors mistakenly look at the archive target as a zero-sum game: it all must be on their hardware. We find that most data centers are better served by a mixed approach that leverages two or more of the above solutions: use disk for the medium-term archive of data, and tape for the long-term deep archive. In fact, in an upcoming column I'll discuss how to leverage tape with either a private or public cloud.


3 Fixes: Hybrid SSD Array Performance Gap
Hybrid SSD arrays are ideal for many data centers but have performance weaknesses. Use these options to boost speed.
2012-12-13 | http://www.informationweek.com/storage/systems/3-fixes-hybrid-ssd-array-performance-gap/240144293?cid=RSSfeed_IWK_authors

To solve performance problems, many data centers are turning to the solid state disk (SSD), but there is a question about how to best leverage this high-speed device in new systems often referred to as hybrid SSD storage systems.

The emerging implementation method is to use caching technologies to automatically move the most active subset of data to the more premium storage tier, and combine that with near-line SAS hard disks to store the least-active, least-performance-sensitive data. In theory, this should give the perfect balance of price and performance. But there is a performance gap to be aware of.
The concern with these systems is that in an effort to hit an ideal price-performance number, there is risk of a significant performance drop because of a cache miss. The performance delta between the two tiers -- SSD vs. near-line hard drives -- is large on a per-drive basis.

The situation is more severe for these newer systems than for legacy enterprise systems. These new systems often use 3-TB high-capacity drives; they are slower than enterprise HDDs; and there are going to be fewer of them. The fewer the hard drives, the worse overall performance will be. Enterprise systems are designed to house plenty of hard drives. These new hybrid systems often are not.

[ Read <a href="http://www.informationweek.com/storage/systems/measuring-the-state-of-primary-storage-d/240142331?itc=edit_in_body_cross">Measuring The State Of Primary Storage Deduplication</a>. ]

As a result, when there is a cache miss, the dropoff to that second tier might be severe, and it might have a significant impact on storage system response time. Obviously, any storage system with a mix of SSDs and hard drives would have this problem, but the gap is more severe in the newer hybrid systems. This performance gap can be especially painful if there are a lot of cache misses caused by either too small a cache investment or a data set that is not cache-friendly.

There are three ways to address this performance gap in hybrid SSD arrays.

1. A Big Cache

The simplest way to cover the performance gap is to make sure that the vendor has the capability to implement a big cache so that cache misses are a rarity. This also requires understanding your data to make sure you know how much of it is active. You need to know this information not only for a given moment in time but over a span of time as well. The ability to trend data activity is critical to any SSD investment and is the subject of our upcoming webinar, <a href="http://www.storage-switzerland.com/BT3BuySSD.html">"Three Things To Know About Your Environment Before Buying SSD"</a>.

Another way to make the cache "bigger" is to use deduplication technologies efficiently. As we discussed in our recent article, <a href="http://www.storage-switzerland.com/Articles/Entries/2012/8/23_Resolving_The_%24_per_GB_Problem_of_SSD_in_Virtual_Environments.html">"Resolving The $ per GB Problem of SSD in Virtual Environments"</a>, deduplication technologies, when implemented inline before data gets to the SSD tier, can effectively make raw SSD capacity five times more efficient. This means that in the right environments, 1 TB of SSD can act like 5 TB. Not only does this help resolve the cost premium of SSD, it also makes a cache miss less likely because the cache is not storing redundant copies of data.

One downside is that most -- but not all -- cache-based environments do not cache writes. Writes drop down to the HDD layer, which again might be slow to respond thanks to the low number of high-capacity drives. How much this affects you will depend largely on your environment.

2. Cache Pinning

Another capability to look for is the ability to "pin" data to cache. This means that you as the user can select key files that you absolutely want served from SSD and lock that data to that tier.
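To make the idea concrete, here is a minimal sketch of how a read cache with pinning might behave. It is illustrative only: the class and method names are my own, real arrays implement this in firmware or in the storage operating system, and the eviction policy is a simple least-recently-used scheme rather than any vendor's actual algorithm.

```python
from collections import OrderedDict

class HybridReadCache:
    """Illustrative SSD read cache sitting in front of an HDD tier."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.cache = OrderedDict()   # block_id -> data, ordered by recency
        self.pinned = set()          # block_ids locked to the SSD tier

    def pin(self, block_id):
        """Lock a block to the SSD tier so it is never evicted."""
        self.pinned.add(block_id)

    def read(self, block_id, hdd_tier):
        if block_id in self.cache:           # cache hit: served from SSD
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        data = hdd_tier.read(block_id)       # cache miss: slow HDD read
        self._admit(block_id, data)
        return data

    def _admit(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        while len(self.cache) > self.capacity:
            # Evict the least recently used block that is not pinned.
            for victim in self.cache:
                if victim not in self.pinned:
                    del self.cache[victim]
                    break
            else:
                break  # everything left is pinned; allow temporary overflow
```

In this sketch a pinned block is never chosen as an eviction victim, which is all that pinning really means. Writes are not shown because, as noted next, most caching implementations pass them straight to the HDD tier.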
One thing to beware of with a pinning feature is that, as mentioned above, most cache systems do not cache write I/O, even if the file associated with that I/O has been pinned to cache. In these systems, writes will still drop to the HDD layer. Again, knowing which data in your environment would be the most logical to pin to cache is critical to using this feature efficiently.

3. A Better HDD Layer

The final capability to look for is the ability to increase the number of hard drives that the second tier has available. This can be done with external shelves, a scale-out cluster, or by using smaller 2.5" drives in the primary chassis. Most environments are not capacity-constrained at the hard drive level. Instead of having excess capacity that you will probably never use, get smaller, less expensive, less power-consuming 2.5" drives. Doing so brings greater redundancy, more flexibility and better performance.

Hybrid SSD arrays are ideal for many data centers, but there are a few weaknesses to understand and address. The way cache works and the ramifications of a cache miss are probably at the top of the list. For some environments a cache miss to the hard drive layer is not going to be a problem; for others, it can cause erratic performance that might be unacceptable. If this is a concern, then a larger cache or a larger hard drive layer -- in terms of spindle count -- might be the better option. Or it might be that an SSD-only appliance makes more sense. The key is to know your application and understand how it is reading and writing to storage.


Find The Right Data Archive Method
Backup-as-archive is an increasingly viable storage solution, especially for companies that don't have strict data retention requirements.
2012-12-11 | http://www.informationweek.com/storage/data-protection/find-the-right-data-archive-method/240144127?cid=RSSfeed_IWK_authors

In almost every study I've done and seen, one fact remains consistent: at least 75% of the data stored on primary storage has not been accessed for more than one year.

This data really should be archived. Products that move such data transparently to an archive have improved dramatically in recent years, and it may be time to reconsider data archiving. In this column I'll look at the competing methods for archiving data; in my next column I'll look at some of the competing archive targets.

Backup As Archive

While not the ideal location, backup is <em>the</em> archive for many companies. Archive purists will not agree with me, but I believe backup products can in some cases solve the archive need, especially for companies that don't need to meet government regulations or other requirements on retaining data. Backup may also be the most realistic way to archive data since most companies are already doing it. As I discussed in <a href="http://www.storage-switzerland.com/Articles/Entries/2011/7/13_When_Does_Backup_Archiving_Make_Sense.html">this article</a>, many organizations count on backups for long-term data retention instead of using a separate archive product.

[ How to choose a virtualization-friendly backup app: See <a href="http://www.informationweek.com/storage/disaster-recovery/big-backup-challenge-of-medium-size-data/240143946?itc=edit_in_body_cross">Big Backup Challenge Of Medium-Size Data Centers</a>. ]
One reason backup archiving has lately gained legitimacy is that backup software can now create larger meta-data tables (data about the data in the backup) and can better search that data. Some products now even offer content search capabilities. Improvements in backup products' scalability are another reason the backup-as-archive approach is more practical than it has been.

The key limiting factor for disk backup products has not been how many disks they can add to the shelf, but how far their deduplication tables scale. This is another meta-data issue. One approach we've seen vendors take is to segment their deduplication table into multiple tables as the data ages. This lowers deduplication effectiveness, but allows for longer retention without impacting current backup performance due to lengthy deduplication table lookups. Eventually, though, deduplication engines will need to be improved in order to scale, as discussed in <a href="http://www.storage-switzerland.com/Articles/Entries/2012/12/6_Does_Backup_Need_A_Better_DeDupe.html">this article</a>.

One thing we don't typically see in the backup-as-archive method is a solution to the problem cited above: removal of data from primary storage. Backup-as-archive is best for companies that are less concerned with how much data they are storing on primary storage and primarily need a way to retain information in case they need it later.

Archive As Archive

Because backup as a long-term retention area is becoming more viable, archive solutions are taking a different approach. Just as solutions that move data from primary storage to archive storage are improving, so is the ability to browse the archive independently of a specific archive application. Most archives now simply show up as a network mount. They also have the ability to leverage tape and disk for excellent restore performance and maximum cost-effectiveness.

The key to archive success is to move it upstream, where it can play a more active role in primary storage. Because of the high level of transparency and fast recovery times, archiving data after 90 days of inactivity will likely have no impact on productivity -- and maximum impact on cost reduction.

There's a lot to be gained by removing 75% or more of your data from primary storage: backups will get faster, and the investment in higher-speed storage (SSD) for the remaining data can be justified. Data integrity will also improve, since most archive solutions perform ongoing data integrity checks, protecting you from silent data corruption (bit rot).

In my next column I'll look at some of the products that are competing for your archive dollars: disk appliances, object storage systems, cloud storage providers, and, of course, tape.


Big Backup Challenge Of Medium-Size Data Centers
Heavy on virtualization, but stuck with a legacy backup solution? Here's how to choose a virtualization-friendly backup app.
2012-12-07 | http://www.informationweek.com/storage/disaster-recovery/big-backup-challenge-of-medium-size-data/240143946?cid=RSSfeed_IWK_authors
Data centers of all sizes struggle with securely and reliably protecting their data, but the medium-size data center might have the most unusual set of challenges. These organizations tend to be heavily virtualized, have very dense virtual-machine-to-host ratios, and be very dependent on their applications to drive the business. They also tend to be the tightest on IT staff and on dollars.

These organizations are often referred to as small- to medium-size businesses (SMB) or small- to medium-size enterprises (SME). I find both these terms too broad because they can range from a very small business with no servers to a relatively large business with dozens of servers. Also, many of the data centers in this group belong to local, state and federal agencies, so they don't typically fit the standard business mold.

In general, the medium-size data center has dedicated servers, most of them virtual, performing tasks such as email, collaboration and file sharing. In most cases it has a database server running a few off-the-shelf applications that have been customized to some extent. It has shared storage, typically an iSCSI SAN or a NAS running NFS, to host its virtualized images. This medium-size data center could also be a computing pocket within a very large organization that for practical reasons needs to manage its own IT resources.

[ Read <a href="http://www.informationweek.com/storage/virtualization/storage-virtualization-gets-real/240142882?itc=edit_in_body_cross">Storage Virtualization Gets Real</a>. ]

Although these organizations tend to struggle with the storage systems supporting their virtual infrastructure, data protection seems to be the harder problem to address. This is probably because they are not yet at the point where they are generating enough storage I/O requests to justify a larger, solid state disk (SSD)-heavy, enterprise-class system. In a recent <a href="http://www.storage-switzerland.com/Blog/Entries/2012/6/19_Does_The_SMB_Need_SSD_Performance.html">test drive</a> we showed that intelligently adding a little SSD could improve performance, and these organizations are fine with that.

Backup and data protection is another challenge altogether. Again, these organizations are heavily virtualized, short on staff, and in many cases don't have a second site to replicate data to for disaster recovery. They often have started with legacy backup solutions or the backup solutions that came with their operating systems. The problem is that they are too heavily virtualized for these types of applications and could benefit from an application that is more virtualization-aware. Enterprise backup applications do a good job of this but tend to be too complex and too expensive for this environment.
As a result, many of these companies turn to virtualization-specific data-protection products.

There are three key things the medium-size data center should look for when selecting a backup application for its virtualized environment. First, can it afford the app? It really doesn't matter how great the features are if there is not enough budget to get the product. I suggest talking price first, before downloading and installing anything.

Second -- and still before installing -- understand what the application's capabilities are for getting data off site. This is especially important if you don't have a secondary site to send data to. Does the software product that you are considering have the ability to send data to a cloud storage facility? And if so, can you leverage that relationship to use the cloud for other purposes?

Third, is the product easy to use, and does it have the features you need to accomplish the task at hand? This part does require downloading and installing a trial of the product. The good news is that in the virtualization-specific backup space, downloadable trials seem to be the common distribution method. But this is also why the first two suggestions above are so important; you don't want to, and probably don't have time to, try every single product on the market.

Certainly there is more to selecting a backup solution for the medium-size organization, but the above is a good start. We discussed many of the remaining issues that need to be considered in our recent webcast <a href="http://www.storage-switzerland.com/BT4VirtSMB.html">The 4 Headaches of Backing Up The Virtualized SMB</a>.


Managing The Multi-Vendor Backup
Backup management applications go a step beyond monitoring, but they remain limited. It's time to develop a framework-driven approach.
2012-11-29 | http://www.informationweek.com/storage/data-protection/managing-the-multi-vendor-backup/240142885?cid=RSSfeed_IWK_authors

In recent columns I have covered the challenges of <a href="http://www.informationweek.com/storage/disaster-recovery/one-backup-app-for-enterprise-not-here-y/240009534">consolidating to a single backup application</a> for the enterprise. In short, no single application can do it all, and the capabilities of application-specific backups are still too compelling. I've also discussed <a href="http://www.informationweek.com/storage/data-protection/consolidation-at-the-disk-backup-applian/240044428">consolidating to the backup appliance</a>, but this leaves gaps in monitoring and managing the mixed environment. There is software available to provide this management overview. In this column I'll discuss what to expect -- and what not to expect -- from these products.

There is a big difference between managing and monitoring. Monitoring basically lets you know that something is wrong, but fixing whatever went wrong means launching the offending application's GUI. Tasks like adding new clients, scheduling and performing restorations can't be done from the monitoring product. An application that manages the environment does more than simply let you know that something is wrong -- it also lets you fix the problem directly from the management application. That includes the ability to add new clients, change backup schedules and execute restores. A management application can eliminate the need for the backup administrator to be an expert in each backup application's interface.
[ Before you can assess a vendor's storage deduplication ability, you need to understand the process. Read about it at <a href="http://www.informationweek.com/storage/systems/measuring-the-state-of-primary-storage-d/240142331?itc=edit_in_body_cross">Measuring The State Of Primary Storage Deduplication</a>. ]

There are several good monitoring applications available that can give you a dashboard showing how your various backup applications are running. If you are running multiple enterprise backup applications, these programs can alert you to problems and errors. Some even provide a help desk workflow capability. But as I discussed in <a href="http://www.storage-switzerland.com/Blog/Entries/2012/2/6_Managing_The_VMware_Data_Protection_Problem.html">Managing The VMware Data Protection Problem</a>, many of these monitoring solutions have largely ignored the VMware-specific backup products. This is particularly problematic because selecting a backup product for the virtualized environment is a key point of backup application splintering. A few companies are now closing this gap by providing monitoring intelligence across both enterprise applications and virtualization-specific backup products.

Applications that monitor backup are mature and provide a sense of control over the mixed backup application problem. They are also relatively cost-effective and easy to install. However, since most lack the ability to do anything beyond monitoring, specific changes still need to be made through each backup application itself. That means the backup administrator must learn each individual application's interface and terminology.

Management applications attempt to go a step further by actually interfacing with the backup application, either through a series of APIs or, more commonly, by controlling it through the command line. The problem with management applications is that they support a limited number of applications, and they exert a relatively low level of control over them. They also tend to be expensive. Few management applications can control all of an enterprise's backup applications, which explains why monitoring programs are more popular: they are more complete and more cost-effective.

In the webinar <a href="http://www.storage-switzerland.com/BT4EntBackup.html">The Four Things That Are Breaking Enterprise Backup</a>, I discussed the need for an alternative to current backup monitoring and management products. These framework products would turn backup applications into service engines, allowing different products to share capabilities. For example, if the developer of an application-specific backup product wants to add tape support, it could get that support from the framework instead of developing it itself.

We're starting to see the beginnings of such framework-driven products now, from larger IT suppliers with multiple backup hardware and software products. The first step is to use these products to better integrate and leverage their own backup investments. Over time, they could open up and allow other vendors to leverage these frameworks.
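To illustrate the service-engine idea, here is a minimal sketch under the assumption of a shared framework. Every name in it is hypothetical; no vendor exposes this exact API. It simply shows one way a capability such as tape support could be registered once and consumed by multiple backup products.

```python
from abc import ABC, abstractmethod

class BackupService(ABC):
    """A capability (tape, cataloging, scheduling) offered by the framework."""
    name: str

    @abstractmethod
    def handle(self, job: dict) -> None:
        ...

class TapeService(BackupService):
    name = "tape"

    def handle(self, job: dict) -> None:
        # In a real engine this would mount media and stream the backup image.
        print(f"Writing {job['source']} to tape pool {job.get('pool', 'default')}")

class BackupFramework:
    """Registry that lets backup products share services instead of rebuilding them."""

    def __init__(self):
        self._services = {}

    def register(self, service: BackupService) -> None:
        self._services[service.name] = service

    def service(self, name: str) -> BackupService:
        return self._services[name]

# A VM-specific backup product gets tape support from the framework
# rather than implementing tape handling itself.
framework = BackupFramework()
framework.register(TapeService())
framework.service("tape").handle({"source": "vm-datastore-snapshot-01", "pool": "offsite"})
```

The point is the separation of duties: the framework owns device support, cataloging, scheduling and policy, while each application-specific product plugs in and consumes those services.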
Measuring The State Of Primary Storage Deduplication
Before you can assess a vendor's storage deduplication ability, it's important to understand the challenges of the deduplication process.
2012-11-20 | http://www.informationweek.com/storage/systems/measuring-the-state-of-primary-storage-d/240142331?cid=RSSfeed_IWK_authors

I've often been asked to rank various vendors' primary storage deduplication capabilities. That is a dangerous prospect, as any ranking is clearly subjective. But I can provide some ideas on how to measure each vendor's primary storage deduplication abilities so you can weigh them in order of importance to your data center.

First, however, we need to discuss the risks of deduplication.

All deduplicated data carries some risk -- after all, you can't get capacity savings for nothing. Deduplication works by segmenting incoming data and creating a unique ID for each segment. These IDs are then compared to the IDs of segments already stored. If there is a redundancy, the redundant data is not stored -- instead, a link is established to the original segment, thereby saving you capacity.

All the IDs are stored in a meta-data table. This table is essentially a roadmap showing which segments belong to which data so that the data can be reassembled when requested. If this table is somehow corrupted, you have more than likely lost the map to your data. Even though the data is still there, you can't access it, at least not easily.

[ Keep your VM data local and protect it too. Here's how: <a href="http://www.informationweek.com/storage/systems/are-san-less-storage-architectures-faste/240077534?itc=edit_in_body_cross">Are SAN-Less Storage Architectures Faster?</a> ]

The size of the meta-data table is a concern for deduplication systems. Each new segment represents an entry in that table, and each redundant segment adds a branch to the tree. The size of the table can cause problems, especially when you consider the speed at which it must be accessed and updated.

Think of the meta-data table as a relatively simple database that needs to be updated and searched quickly. This is especially important in primary storage because you don't want write performance to be impacted while the table is searched for redundancy. To avoid this problem, most vendors place their table in RAM. However, in the case of a large primary storage system that houses dozens -- or even hundreds -- of TBs of information, the entire table can't fit in memory. To get around this, the table is split between RAM and disk. The problem there is that deduplication lookups are not cache-friendly: a first-in, first-out method of keeping part of the table in RAM does not generate viable hit rates. To overcome this, some vendors deploy their tables in flash; others process deduplication as an off-hours job rather than performing the function in real time.

Understanding deduplication meta-data is an important first step in assigning a grade to a vendor's deduplication efforts. Most of the problems described here can be overcome, as discussed in this <a href="https://www.brighttalk.com/webcast/5583/49481">recent webinar</a>.

In my next column, I'll discuss what to look for in a vendor to ensure that your primary storage deduplication technology is safe, fast and scalable.
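As an illustration of the segment-and-index mechanism described above, here is a minimal sketch of inline deduplication. It is a toy, not any vendor's implementation: fixed-size segments, SHA-256 digests as segment IDs, and an in-memory dictionary standing in for the meta-data table.

```python
import hashlib

SEGMENT_SIZE = 4096  # bytes; real systems use fixed or variable-size segments

class DedupeStore:
    def __init__(self):
        self.segments = {}   # segment ID -> actual data (stored once)
        self.metadata = {}   # object name -> ordered list of segment IDs (the "map")

    def write(self, name, data):
        ids = []
        for offset in range(0, len(data), SEGMENT_SIZE):
            segment = data[offset:offset + SEGMENT_SIZE]
            seg_id = hashlib.sha256(segment).hexdigest()
            if seg_id not in self.segments:      # new segment: store it
                self.segments[seg_id] = segment
            ids.append(seg_id)                   # redundant segment: store only a reference
        self.metadata[name] = ids

    def read(self, name):
        # Without the meta-data table this reassembly is impossible,
        # which is why corrupting the table effectively loses the data.
        return b"".join(self.segments[seg_id] for seg_id in self.metadata[name])

store = DedupeStore()
store.write("vm1.vmdk", b"A" * 8192 + b"B" * 4096)
store.write("vm2.vmdk", b"A" * 8192)             # fully redundant: no new segments stored
assert store.read("vm2.vmdk") == b"A" * 8192
print(len(store.segments), "unique segments for", len(store.metadata), "objects")
```

Note how the meta-data table (here, the metadata dictionary) must be consulted on every write and grows with every new segment, which is exactly the scaling and lookup-speed problem described above.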
Are SAN-Less Storage Architectures Faster?
Keeping VM data local will boost performance -- but you need to make sure it's protected. Here's how.
2012-11-12 | http://www.informationweek.com/news/240077534?cid=RSSfeed_IWK_authors

As I discussed in a <a href="http://www.informationweek.com/storage/systems/storage-that-only-looks-like-a-san/240008478">recent column</a>, an ever-increasing number of solutions now provide a SAN-less infrastructure for virtual environments. Initially these SAN-less designs were sold on the concept of simplicity (no SAN, fewer headaches), but now several vendors claim a performance advantage by leveraging internal SSD as well.

At first the performance advantage might seem obvious, since the storage is directly inside the hosts and I/O does not need to traverse a network. But that is not always the case -- it depends on how the SAN-less storage system is architected.

Many SAN-less designs stripe a file across all the hosts in the virtual cluster, much like a scale-out storage system stripes data across its nodes. Essentially this is a virtualized scale-out storage system, with a VM running on each host acting as a storage node. While this method does bring high availability to the design, it also introduces a network. If the SAN-less design stops here, it will have the same network-related performance limitations as a shared storage system.

[ For more on shared vs. local storage, see <a href="http://www.informationweek.com/is-shared-storages-price-premium-worth-i/240008024?itc=edit_in_body_cross">Is Shared Storage's Price Premium Worth It?</a> ]

Many SAN-less systems take a different approach, keeping the data needed by a virtual machine (VM) on a host completely intact on that host -- that way the VM does not need to go across a network to fetch data. Many of these designs leverage internal PCIe-based flash storage or internal solid state drives (SSDs). If the VM can get all its data from an internal flash memory storage device, performance will be excellent.

The challenge with this approach is making sure that VM data is protected if there is a failure inside the server, so that it is available if the VM is migrated to another server. There are three basic techniques for ensuring its availability while taking advantage of local performance.

The first is a drop-through technique, in which the data is still stored on the local solid state storage device and is then written through to a shared storage device in real time. Basically, this is a split-write, or mirrored, approach. This method can add some latency on write traffic but should realize excellent performance on reads. It also, obviously, re-introduces a SAN and eliminates the advantage of being SAN-less.

More common is the second technique, discussed in <a href="http://www.storage-switzerland.com/Articles/Entries/2012/4/18_Building_The_SAN-Less_Data_Center.html">Building The SAN-Less Data Center</a>.
This approach keeps all the data on the PCIe flash card or SSD in the host, as described above, but replicates a second copy to the other hosts in the striped fashion that scale-out storage systems use. This way, all the hosts have access to the secondary copy for migration in case of failure of the primary host, without needing a secondary SAN. Also, once the migration occurs, data can be re-copied onto the new host so that access is again local.

Finally, there is the newest technique, discussed in <a href="http://www.storage-switzerland.com/Articles/Entries/2012/9/20_The_Benefits_of_a_Flash_Only,_SAN-less_Virtual_Architecture.html">The Benefits of a Flash Only, SAN-less Virtual Architecture</a>.

Each of these options should deliver excellent performance, especially since all reads would come from the locally attached storage. But each requires dual copies of data, and while it is not officially a SAN, each does require some form of specialized network. The question remains: is this performance better than a properly configured SAN, and if so, is it worth the limitations? I'll discuss that in my next column.


Consolidation At The Disk Backup Appliance
With a few enhancements, such as tape support and improved reporting capabilities, backup appliances could become the perfect solution for consolidating data protection.
2012-11-05 | http://www.informationweek.com/news/240044428?cid=RSSfeed_IWK_authors

I spoke at several sessions at <a href="http://poweringthecloud.com/home">SNW Europe</a> last week, and one session was about next-generation data protection. Attendees asked an interesting question, which ties into my <a href="http://www.informationweek.com/storage/disaster-recovery/how-to-develop-multi-vendor-backup-strat/240010596">last column</a> on the challenges of backup software consolidation: how do we consolidate the data protection process?

Most agree they can't get data protection done with one backup application. It may turn out that the most viable option is for consolidation to occur at the backup appliance.

As discussed in my last column, there are three basic ways to consolidate data protection. First, you can centralize on a single enterprise backup application, which might not give the best protection possible for every application but provides a single point of backup management. Second, you can purchase a management application that provides management and monitoring of multiple backup applications. Third, you can have multiple applications back up to a single device. In this column, I'll discuss the idea of consolidating backups to a single appliance.

[ Patchwork backup systems are all too common in many enterprises, making data protection more expensive and time-consuming than it should be. Read more at <a href="http://www.informationweek.com/storage/disaster-recovery/one-backup-app-for-enterprise-not-here-y/240009534?itc=edit_in_body_cross">One Backup App For Enterprise: Not Here Yet</a>. ]

Most backup appliances today are disk-focused systems.
Historically, the value of these systems has been to provide deduplication and drive down the cost of disk-based backup, striving to reach price parity with tape. They also allow multiple backup applications to write to them at the same time. Now they are expanding their value by adding integration with specific backup applications.

This gives the backup application greater control over the disk backup appliance. For example, the backup application can control the deduplication appliance's replication function. This allows it to choose which backup jobs are replicated to the remote site and when that replication is triggered. In some cases it can also pre-seed the indexes at the DR site so that a preconfigured backup server is instantly ready to begin restoring data.

Another valuable feature is the ability to distribute the deduplication function, as discussed in my recent article, <a href="http://www.storage-switzerland.com/Articles/Entries/2012/8/13_Beyond_The_Backup_Window.html">Beyond The Backup Window</a>. This allows the backup application to perform a pre-flight check of data before sending it over the network to the disk backup appliance. In most cases, the backup application does a lightweight redundancy check prior to sending the data. Essentially, it eliminates the obvious duplicates and lets the disk appliance do the lower-level redundancy check. This makes the backup server work a little harder but lightens the load on the network and on the disk backup appliance.

An increasing number of enterprise backup applications and application-specific backup products support these capabilities. This allows them to leverage the technology that disk backup appliances already have so they can focus their development resources elsewhere. For the data center, it means that the disk backup appliance can become the consolidation point, allowing each group to select its own backup application.

An important next step for these disk backup appliances is to support tape. This would allow the disk backup appliance to directly move data to tape for offsite vaulting. While this may seem like an odd move for disk backup appliance vendors who have lived by the "tape is dead" mantra, it is actually the more pragmatic strategy. As I discussed in a <a href="http://www.storage-switzerland.com/Blog/Entries/2012/6/1_The_Hottest_Technology_of_2012_-_Tape.html">recent blog post</a>, large enterprises have continued to use tape alongside disk, while smaller data centers are returning to tape in order to curtail the growth of their disk backup appliances. I discussed this concept some time ago in my article <a href="http://www.storage-switzerland.com/Articles/Entries/2010/3/18_Backup_Virtualization.html">Backup Virtualization</a>, and now we are seeing tape support become a common item on disk backup appliance vendors' roadmaps.

The final key step is for disk backup appliances to improve their reporting capabilities, reaching out to the backup applications in order to correlate what the appliance has stored with what the backup application says it sent. The appliance could then present a single success/fail report for the enterprise. While some disk backup appliances have basic reporting now, most need significant improvement.

If backup appliance vendors embrace these concepts, they could end up developing the backup consolidation solution so many users are looking for.
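The pre-flight check described above is essentially a hash exchange between the backup server and the appliance. Here is a minimal sketch of that idea; the function names and the single round of exchange are my own simplification, not any appliance's actual protocol.

```python
import hashlib

def segment_ids(data, size=4096):
    """Split a backup stream into segments and return (id, segment) pairs."""
    for offset in range(0, len(data), size):
        segment = data[offset:offset + size]
        yield hashlib.sha256(segment).hexdigest(), segment

def preflight_send(backup_stream, appliance_known_ids):
    """Ask the appliance which segment IDs it already has; send only the rest."""
    to_send = []
    for seg_id, segment in segment_ids(backup_stream):
        if seg_id not in appliance_known_ids:   # obvious duplicate? skip it
            to_send.append((seg_id, segment))
    return to_send

# The appliance already holds yesterday's full backup of this data set.
known = {seg_id for seg_id, _ in segment_ids(b"A" * 8192)}
tonight = b"A" * 8192 + b"C" * 4096             # only the last 4 KB is new
payload = preflight_send(tonight, known)
print(f"Sending {len(payload)} of {len(tonight) // 4096} segments over the network")
```

In practice the appliance still performs its own finer-grained redundancy check on whatever arrives; the point of the pre-flight pass is simply to keep the obvious duplicates off the network.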
How To Develop Multi-Vendor Backup Strategy
Try to find one app that gives good generic coverage, then pinpoint certain applications or environments that need extra attention.
2012-10-29 | http://www.informationweek.com/news/240010596?cid=RSSfeed_IWK_authors

In my last <a href="http://www.informationweek.com/storage/disaster-recovery/one-backup-app-for-enterprise-not-here-y/240009534">column</a> I discussed how consolidating all your backup processes to a single application is almost impossible given today's environment. It has been months since a column generated that much email traffic. People are strangely passionate about backup. I heard from vendors questioning my sanity, and from backup administrators who confirmed that multi-vendor backup is a reality but wanted some options on how to deal with the mixed environment.

An enterprise backup application is one that attempts to protect more than just one application, operating system, or environment such as VMware or Hyper-V. It also tends to support a broad range of backup devices such as tape libraries, disk backup appliances, and cloud providers. There are a dozen or so of these enterprise backup applications to choose from.

Point solutions or backup utilities are products that target a particular virtualized environment, or specific applications such as SQL Server, Exchange, SharePoint, and Oracle. There are dozens if not hundreds of these products on the market today. This creates a challenge for IT professionals entrusted with protecting an organization's data assets: do they use a single enterprise solution or a suite of backup utilities?

[ Read <a href="http://www.informationweek.com/storage/disaster-recovery/no-disaster-recovery-plan-no-excuse/240005645?itc=edit_in_body_cross">No Disaster Recovery Plan? No Excuse</a>. ]

We just completed a <a href="http://www.storage-switzerland.com/BT4EntBackup.html">webinar</a> about the challenges facing enterprise backup. We did a live poll of the viewers, asking how many backup applications they had in place. Not one organization had standardized on a single backup application, and more than 50% had more than three backup applications in use.

Clearly, consolidation isn't happening in backup, at least at the software level. Why? As much as the enterprise backup vendors will disagree, there are some things that the point applications do better. Although a case could be made that you have to give up some of the "cool" capabilities of these point solutions for the greater good of a consolidated enterprise backup strategy, many of these point solutions are particularly good at rapid recovery. They're so good, in fact, that managers of those applications might say that recovering their app rapidly is more important than the greater good. As a result, the enterprise ends up with multiple data protection applications.

So what to do? First, an application that can provide generic coverage of most of your environment is an important foundation of any data protection strategy. If the enterprise application can truly give you all you need, stop there. No need to complicate things.
But I believe, and our research shows, that most enterprises will benefit from a few solutions targeted at certain applications or environments.

Once you accept mixed backup as fact, the next step is to manage the reality. Understand which backup application is backing up each set of data. In many cases we have found that one set of data is being protected by four or five different applications or utilities. Even the most critical data probably does not need that many levels of protection.

Once you identify how many redundant copies of each data set you need, and confirm that those copies are getting made as scheduled, the next step is to develop an understanding of which data set you are going to restore from in different recovery scenarios. We have seen several cases where an application was down for hours while IT struggled to find the right backup copy to restore from. Knowing which copy is the "right" copy is critical.

In my next column I'll discuss the next step: identifying an umbrella application that can, at a minimum, provide cross-application monitoring and potentially provide a policy engine. Finally, we will end with potentially the best place to consolidate backups: the backup storage device.


One Backup App For Enterprise: Not Here Yet
Patchwork backup systems for virtual and non-virtual parts of the enterprise, with no centralized view, make protecting data more time consuming and more expensive than it should be.
2012-10-22 | http://www.informationweek.com/news/240009534?cid=RSSfeed_IWK_authors

Data centers undergo constant change--new applications and operating environments are added all the time. With each of these changes, new data protection challenges arise, and new software applications are created to solve the unique protection challenges these environments create. But it's hard to find a single data protection solution that answers all the demands of an enterprise.

An excellent case in point is the flood of products focused on protecting the virtualized environment, VMware specifically. These products are fine-tuned to make data protection of the virtualized environment easier and to take full advantage of the capabilities that the virtualized environment can provide, such as changed block tracking for smaller backups and instant recovery of virtual machines from the backup target.

A potential weakness of these virtualization-specific backup products is that they do not support protection of the non-virtualized part of the data center, which according to some reports is 50% or more of servers. Although the data center is moving toward 100% virtualization, it is going to take a while to get there. The remaining systems typically stand alone for a reason, and virtualizing them might be a problem. The backup administrator is left running a separate backup application for as much as 50% or more of the data center.

As we discussed in two recent reports, <a href="http://www.storage-switzerland.com/Blog/Entries/2012/9/6_From_Backup_Utility_To_Complete_Solution.html">From Backup Utility To Complete Solution</a> and <a href="http://www.storage-switzerland.com/Blog/Entries/2012/6/12_Unifying_Virtualized_Backup,_Replication_and_Recovery.html">Unifying Virtualized Backup, Replication and Recovery</a>, some VM-specific backup applications are adding non-virtualized server backups to their capabilities. But there are other needs that so far only enterprise backup applications are providing, such as tape support and robust online backups.
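Changed block tracking, mentioned above as one of the capabilities these virtualization-specific products exploit, is easy to picture in miniature. The sketch below is purely illustrative; hypervisors implement this at the virtual disk layer, not in application code, but it shows why tracking dirty blocks makes incremental backups so much smaller.

```python
class TrackedDisk:
    """Toy virtual disk that records which blocks changed since the last backup."""

    def __init__(self, num_blocks, block_size=4096):
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]
        self.changed = set()                 # the "changed block tracking" bitmap

    def write(self, index, data):
        self.blocks[index] = data
        self.changed.add(index)

    def incremental_backup(self):
        """Copy only the blocks written since the previous backup, then reset."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta

disk = TrackedDisk(num_blocks=1024)
disk.write(7, b"x" * 4096)
disk.write(42, b"y" * 4096)
backup = disk.incremental_backup()
print(f"Backed up {len(backup)} of {len(disk.blocks)} blocks")   # 2 of 1024
```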
To make matters worse, other applications--databases in particular--often have their own ecosystem of data protection solutions. As is the case with the virtualized environment, these utilities provide advanced data protection capabilities for their specific environment that enterprise applications don't.

This leads to a fragmented data-protection strategy in which VMware administrators pick the application they want to protect their environment, application owners pick the application they want to protect theirs, and the enterprise backup application is left protecting the leftovers--and often protecting the virtualized and application environments a second time. The result is a data protection process that is more complex and more costly than it should be.

So how do you get the best of all worlds if you have mission-critical applications that are not going to be virtualized soon? As we will discuss in our upcoming webinar, <a href="http://www.brighttalk.com/webcast/5583/56309">The Four Things That Are Breaking Enterprise Backup</a>, the enterprise applications need to evolve into data protection engines that provide basic protection; advanced backup-target support for tape and disk backup appliances; cataloging; scheduling; and policy management.

Many enterprise backup applications provide these capabilities today, but the capabilities are locked within the application. What the vendors need to do is open them up by providing a framework or API set. Then application-specific data protection products could plug into these engines. This would give the backup administrator a centralized view of the data protection process and give application owners the specific capabilities that make their jobs easier.


Fast SSD Storage, Slow Networks
Can't afford a network upgrade? Boost performance by enhancing your existing infrastructure with one of these solutions.
2012-10-10 | http://www.informationweek.com/news/240008853?cid=RSSfeed_IWK_authors

Systems built with solid state disks (SSD) represent the cutting edge in performance and are the "go-to" option for data centers looking to solve performance problems. So these systems should be coupled with the absolute cutting edge in storage I/O performance, too, right?

In some cases, they do need a high-performance network. But in many cases, the server--or the application running on that server--cannot take full advantage of a high-speed network, even if it can take advantage of SSD.

While the options for high-speed networking are increasing and getting more affordable, it's still a considerable investment to upgrade a storage network infrastructure. Not only is there the cost of the physical components like switches, cards, and cabling, there's also the time required to reconnect each server.

[ For more advice on affordable SSD storage, see <a href="http://www.informationweek.com/storage/data-protection/how-to-choose-best-ssd-for-midsize-data/240008584?itc=edit_in_body_cross">How To Choose Best SSD For Midsize Data Centers</a>. ]

This combined cost can be difficult to justify, especially if the server or application can't take full advantage of the new network. But in many cases, SSD can make a significant performance difference even on a slow, un-optimized network. We proved this in our labs during <a href="http://www.storage-switzerland.com/Blog/Entries/2012/6/19_Does_The_SMB_Need_SSD_Performance.html">a recent test drive</a>.
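A rough back-of-envelope calculation shows why that result is plausible. The figures below are illustrative assumptions rather than measurements from that test: roughly 5 ms for a random read from a near-line hard disk, 0.1 ms from an SSD, and well under 0.1 ms to move an 8 KB block across 1 Gbps Ethernet.

```python
# Illustrative, assumed figures -- adjust for your own environment.
HDD_READ_MS = 5.0        # random read service time on a near-line hard disk
SSD_READ_MS = 0.1        # random read service time on an SSD
GBE_TRANSFER_MS = 0.07   # ~8 KB over 1 Gbps Ethernet (~125 MB/s), ignoring protocol overhead
TEN_GBE_TRANSFER_MS = 0.007

hdd_over_1gbe = HDD_READ_MS + GBE_TRANSFER_MS        # ~5.07 ms
ssd_over_1gbe = SSD_READ_MS + GBE_TRANSFER_MS        # ~0.17 ms
ssd_over_10gbe = SSD_READ_MS + TEN_GBE_TRANSFER_MS   # ~0.107 ms

print(f"HDD over 1GbE : {hdd_over_1gbe:.3f} ms per random 8 KB read")
print(f"SSD over 1GbE : {ssd_over_1gbe:.3f} ms per random 8 KB read")
print(f"SSD over 10GbE: {ssd_over_10gbe:.3f} ms per random 8 KB read")
```

Under these assumptions, swapping HDD for SSD cuts per-read latency roughly 30-fold even on the old 1GbE network, while the further jump to 10GbE shaves off comparatively little. That is the column's point: for latency-bound workloads, the media upgrade matters more than the network upgrade.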
If you don't have the budget to upgrade your network infrastructure, or if your servers can't take advantage of that upgrade, you have several options. One is to install a system designed to take advantage of SSD anyway. As I mentioned, some database applications see a significant performance increase simply by adding SSD without upgrading the network.

Another option is to add SSD to the server, using either drive-form-factor SSDs or PCIe SSD. As I discussed in <a href="http://www.storage-switzerland.com/Articles/Entries/2012/9/24_Greater_VMware_ROI_With_PCIe-Attached_SSD.html">a recent article</a>, one solution is to install PCIe SSD in the most performance-sensitive hosts in your environment and then leverage that PCIe flash for memory swap space and as a read cache.

The storage network is still necessary to save new or changed data (writes), but that's all it is needed for. Reads come at high speed directly from the host's PCIe bus, so the storage network is essentially cleared for write traffic only. The result is improved performance for both reads and writes.

In some cases, you might need shared read and write flash performance. This is more economical if flash capacity can be shared across hosts via a network, but you still don't necessarily need to upgrade the network. As I discussed in <a href="http://www.storage-switzerland.com/Articles/Entries/2012/9/19_The_Hidden_Cost_of_Networking_SSD.html">another recent article</a>, some SSD systems now include built-in networking and can support direct-attached 1GbE connections. Trunking two to four connections per host can provide the performance hosts need at a price the data center can afford.

If you don't have the budget or the applications to justify a high-speed network upgrade, try enhancing your existing infrastructure for an affordable way to increase performance.


How To Choose Best SSD For Midsize Data Centers
Solid-state disk storage effectively boosts performance in nearly any size data center, but midsize data centers have particular affordability questions.
2012-10-08 | http://www.informationweek.com/news/240008584?cid=RSSfeed_IWK_authors

One consistent truth we have seen about solid state disk (SSD) is that the technology can improve performance in almost any size data center. The problem for midsize data centers beginning to explore this technology is how best to afford it. The SSD vendors have a seemingly endless set of options for data centers to consider, but which one is best for the midsize data center?

Overall, there are three basic ways to implement flash SSD in the environment. First, SSD can be added to a server, either as drive-form-factor solid state drives in its drive bays or via a PCIe card slot. Second, an appliance can be added to the network that acts either as a standalone storage system or as a cache for an existing storage system. And finally, SSD can be integrated into a storage system.

Which of these you should pick is largely dependent on two factors. First, how many servers are giving you a performance problem?
Second, where are you in your storage refresh cycle? If you have one server that is giving you a particular performance problem, then SSD in the server is the quickest and often least expensive fix. First, you have to make sure the application can leverage the flash device, either because it lets you relocate hot files or because you add a caching solution (of which there are many now). <P> We are finding that the scenario of a single application being the only performance problem is becoming increasingly rare as the environment becomes more and more virtualized. This is especially true in the midsize data center, which tends to have much higher virtualized server ratios. If you have several virtualized hosts, then a flash/SSD appliance or a hybrid storage system may be a better fit. <P> Flash/SSD appliances are ideal for adding to an existing infrastructure to extend the life of the current storage system. These flash-only systems have typically been out of reach of the midrange data center. Now, though, we are seeing shareable systems become available at a much lower cost and with a simple software feature set. The expectation is that you will leverage the storage management capabilities within the hypervisor instead of paying for them again in the storage system. <P> Hybrid systems are a new generation of storage systems that do more than just add flash to existing legacy storage solutions. They are typically custom built to support flash and its performance capability. When the time comes to refresh the storage infrastructure, these systems are well worth considering. As we will discuss in our upcoming webinar, "<a href="http://www.brighttalk.com/webcast/5583/56427">The Four Advantages to Hybrid Flash Arrays</a>," these systems allow the data center to leverage the use of solid state across a wide range of physical hosts, but still leverage HDD to keep costs in check. <P> In reality there is no perfect single solution for all midsize data centers; much of the choice depends on where you are in your storage refresh cycle and the design of the environment. Considering that most midsize data centers are highly virtualized, any SSD investment will have to be accessible by multiple server hosts, either through local server-to-hypervisor coordination or over a storage network.2012-10-04T12:55:00ZStorage That Only Looks Like A SANIt isn't a perfect approach, but you can skip the SAN and still bring the advantages of local storage to a virtualized environment.http://www.informationweek.com/news/240008478?cid=RSSfeed_IWK_authorsIn my <a href="http://www.informationweek.com/storage/systems/how-to-share-local-storage/240008168">last column</a>, I discussed how some vendors are abandoning shared storage for virtualized environments in favor of local storage. Their goal is to reduce cost and complexity while increasing performance. <P> Through the use of inter-host mirroring and replication they can still provide many of the key features of virtualization, but there are some problems: You need a complete second copy of a virtual machine (VM) on another host, you are limited to only that second host for failover or migration (unless you make multiple copies), and there is CPU consumption required of the second target VM. Essentially, you double your VM count and the resources those VMs require. In a resource-constrained environment, this could be a problem.
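<P> A quick way to see the scale of that overhead is to add up what the designated target hosts must hold in reserve. The sketch below is purely illustrative: it assumes a full second copy of each protected VM's image plus a standby footprint on the partner host, which is the worst case described here.
<pre>
# Illustrative arithmetic only: the cost of inter-host mirroring, where every
# protected VM keeps a full second copy (and a standby VM) on a partner host.

from dataclasses import dataclass

@dataclass
class VM:
    name: str
    capacity_gb: int
    vcpus: int
    ram_gb: int

def mirroring_overhead(vms):
    """Extra capacity and reserved resources consumed by full per-VM mirrors."""
    return {
        "extra_capacity_gb": sum(v.capacity_gb for v in vms),  # a full second copy of each image
        "reserved_vcpus": sum(v.vcpus for v in vms),           # held back on the target hosts
        "reserved_ram_gb": sum(v.ram_gb for v in vms),
    }

if __name__ == "__main__":
    fleet = [VM("sql01", 500, 8, 32), VM("web01", 100, 2, 8), VM("file01", 2000, 4, 16)]
    print(mirroring_overhead(fleet))   # everything here is effectively doubled
</pre>
<P> In practice the standby VMs may idle at a fraction of these reservations, but the capacity doubling is unavoidable without the RAID-like striping approach discussed next.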
<P> Vendors are trying to deliver other solutions that keep the cost, simplicity, and performance advantages of local storage solutions but that still provide VM flexibility and efficiency. One approach is the SAN-Less SAN. <P> <strong>[ For more on shared vs. local storage, see <a href="http://www.informationweek.com/is-shared-storages-price-premium-worth-i/240008024">Is Shared Storage's Price Premium Worth It?</a> ]</strong> <P> The SAN-Less SAN is actually another form of shared storage, but the storage is in the physical hosts of the virtual infrastructure instead of on a dedicated shared storage system. Each host is equipped with hard drives or Flash SSD storage, and as data is being stored it is written across each host in the infrastructure--similar to how data is written across the nodes of a scale-out storage cluster. <P> Redundancy is achieved by using a RAID-like data striping technique so that failure of one host or the drive of one host does not crash the entire infrastructure. As in traditional RAID, the redundancy is provided without requiring a full second copy of data. Also, it is not uncommon for the disks in each node to themselves be RAIDed via a RAID card inside the server. <P> This technique of striping data across physical hosts provides the VM flexibility. All the hosts can get to the VM images, so a VM can be migrated in real time to any host. <P> One downside of the SAN-Less SAN approach is that you lose the performance advantage of pure local storage since parts of the data must be pulled from the other hosts. From a performance perspective, you have essentially created a SAN. <P> As discussed in my article, <a href="http://www.storage-switzerland.com/Articles/Entries/2012/4/18_Building_The_SAN-Less_Data_Center.html">Building The SAN-Less Data Center</a>, some vendors are merging features of local storage with this SAN-Less technique to bring the best of both worlds. These vendors are keeping a copy of each VM's data local to the host on which it is installed in addition to replicating the VM&#8217;s data across the host nodes. The value of this technique is that the VM gets local performance until it needs to be migrated. A second step in migration allows the newly migrated VM to have its data rebuilt on its new host, restoring performance. This is especially intriguing if the local data is PCIe Solid State Disk. <P> Of course, nothing is perfect, and the network that interconnects these hosts must be well designed. There is also some host resource consumption as the software that runs the data replication on each host does its work. However, that consumption should not be as significant as a host loaded down with target VMs in the mirroring/replication example discussed in <a href="http://www.informationweek.com/storage/systems/how-to-share-local-storage/240008168">my last column</a>. Finally, the type of hard disks and solid state disks used in the hosts in a SAN-Less SAN must also be carefully considered. <P> Despite the advantages of local storage and SAN-Less SANs, shared storage is far from dead. In my next column, I will look at local storage vs. SANs. <P> <i>Even small IT shops can now afford thin provisioning, performance acceleration, replication, and other features to boost utilization and improve disaster recovery. Also in the new, all-digital <a href="http://www.informationweek.com/gogreen/082712smb/?k=axxe&cid=article_axxt_os">Store More</a> special issue of InformationWeek SMB: Don't be fooled by Oracle's recent Xsigo buy.
(Free registration required.)</i>2012-10-01T09:03:00ZHow To Share Local StorageLocal storage can now provide capabilities like virtual machine migration and distributed resource management.http://www.informationweek.com/news/240008168?cid=RSSfeed_IWK_authorsAs I covered in my <a href="http://www.informationweek.com/is-shared-storages-price-premium-worth-i/240008024">last column</a>, local storage is beginning to give shared storage some competition as the storage platform of choice in virtual environments. In theory, shared storage should have an advantage since multiple hosts have to have access to the same virtual images for capabilities like virtual machine migration or distributed resource manager to work. Vendors that are promoting this SAN-less data center concept have developed alternative ways to share that data. <P> In this column we will discuss how those vendors are creating a SAN-less environment that still can provide capabilities like virtual machine migration while at the same time benefiting from the simplicity and performance of local storage. There are two common approaches to accomplish this feat: mirroring/replication and something we call the SAN-less SAN. In this column we will cover the mirroring/replication technique; we will discuss the SAN-less SAN in our next. <P> <strong>Mirroring/Replication</strong> <P> The simplest approach is a technique that leverages a mirroring or replication model. Basically an alternative host is designated and data from the first host is replicated to that host. A mirroring approach means that the data on the target host is 100% in sync with the data on the source host. A replication model means that the target host may be slightly out of sync with the source. <P> If the source host or a VM on the source fails or needs maintenance, the target host or VM can be moved into production. With the replication technique, the VM on the source has to be gracefully shut down for a clean startup on the target. The mirroring technique should be able to be started up instantly and should not need a graceful shutdown. <P> Initially, the mirroring or replication technique was popular in smaller data centers that simply needed availability more than they needed all the benefits of virtual machine migration. In fact, as we discuss in our article <a href="http://www.storage-switzerland.com/Articles/Entries/2011/9/28_For_the_Small_to_Medium-sized_Company_Backup_is_All_About_Recovery_TIME.html">"For the Small to Medium-sized Company Backup is All About Recovery TIME"</a> some backup applications now provide this functionality. They backup physical and virtual servers to a backup appliance and then in the case of failure or maintenance can host the virtual machine directly on the backup appliance. Essentially this solves two problems at the same time; providing virtual machine flexibility and data protection. <P> The mirroring concept, as we describe in our article <a href="http://www.storage-switzerland.com/Articles/Entries/2012/9/20_The_Benefits_of_a_Flash_Only%2C_SAN-less_Virtual_Architecture.html">"The Benefits of a Flash Only, SAN-less Virtual Architecture"</a> is becoming more appealing to enterprises because it leverages PCIe SSDs installed inside of hosts and integrates with the fault tolerance software that comes with the hypervisors. This is a particularly interesting approach. It takes full advantage of the performance capabilities of PCIe SSD while at the same time provides automated, 30 second failover of failed VMs or hosts. 
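<P> The distinction between the two techniques comes down to when a write is acknowledged. The sketch below is a minimal illustration of that difference, not any vendor's implementation: the mirrored volume copies every write to the partner before acknowledging it, while the replicated volume acknowledges locally and ships changes later, which is why it may need a graceful shutdown before failover.
<pre>
# A minimal sketch of mirroring vs. replication semantics, not a real product.

from collections import deque

class MirroredVolume:
    def __init__(self, local, partner):
        self.local, self.partner = local, partner
    def write(self, block, data):
        self.local[block] = data
        self.partner[block] = data          # synchronous: partner is always in sync
        return "ack"                        # safe to fail over instantly

class ReplicatedVolume:
    def __init__(self, local, partner):
        self.local, self.partner = local, partner
        self.pending = deque()
    def write(self, block, data):
        self.local[block] = data
        self.pending.append((block, data))  # partner may lag behind the source
        return "ack"
    def drain(self):                        # e.g. during a graceful shutdown before failover
        while self.pending:
            block, data = self.pending.popleft()
            self.partner[block] = data
</pre>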
<P> The downside of the mirroring/replication techniques is that you need another copy of the virtual machine image for each physical server that you want to have the potential to host a particular virtual machine. In the smaller enterprise this may not be a problem, since capacity is inexpensive and these organizations often buy more than they need. Also remember that, as we discussed in our last column, local storage should be less expensive than shared storage. <P> Another downside is that the target VMs are consuming some level of CPU and memory resource on the target hosts on which they reside. Again, especially in the small enterprise and even in the large enterprise, there is often plenty of excess CPU. <P> Finally, there is the downside of loss of flexibility. With shared storage any server connected to it can typically host any VM. With the mirroring/replication technique you are limited to just the designated target. This shortcoming is overcome by the second technique, the SAN-less SAN, which we will cover in an upcoming column.2012-09-27T08:40:00ZIs Shared Storage's Price Premium Worth It?As local storage improves, shared storage proponents' justifications for investing more in SANs and NAS just aren't cutting it.http://www.informationweek.com/news/240008024?cid=RSSfeed_IWK_authorsVirtualization has propelled the adoption of storage area networks (SAN) and network attached storage (NAS) to new levels. That adoption comes with new levels of frustration as the task of operating shared storage, already difficult, becomes even more challenging in the virtualized environment. <P> Increasing numbers of vendors are encouraging IT professionals to just say "no." As we discussed in our recent article "<a href="http://www.storage-switzerland.com/Articles/Entries/2012/9/20_The_Benefits_of_a_Flash_Only%2C_SAN-less_Virtual_Architecture.html">The Benefits of A Flash Only, SAN-Less Architecture</a>," customers are looking for alternatives to shared storage. Local storage, in the form of internal hard disks and even PCIe SSD, has emerged as a leading candidate to replace the SAN. Local storage has developed workarounds for its biggest weakness: lack of share-ability. <P> In a future column we will begin to explore some of the "SAN-less" shared storage options. But how did we get to where we are? Why is the frustration with shared storage so high? <P> Generally, administrators cite three sources of SAN frustration. First there is the cost of shared storage, which almost always carries a premium compared to local storage. Second there is the frustration with having to constantly tune the storage and its supporting infrastructure, something that is increasingly problematic in the ever-changing virtual environment. Finally there is the frustration over the complexity of day-to-day management of the SAN. <P> <strong>[ Learn <a href="http://www.informationweek.com/storage/systems/how-to-choose-right-unified-storage-syst/240007223?itc=edit_in_body_cross">How To Choose Right Unified Storage System</a>. ]</strong> <P> In this column we will focus on the first frustration, the price premium. The premium price of shared storage is caused partly by the cost of the infrastructure required to share storage: the adapters that go into the servers and the switches that the adapters and the storage connect to. Of course this is data, so everything has to be redundant, which compounds the cost problem. <P> Another source of the price premium is the cost of the actual storage unit.
It also must be highly available, so that means multiple ports, power supplies, and storage controllers. Local storage also needs these same components, sometimes even redundantly, but all these components exist inside the server they are being installed in, which reduces costs considerably. <P> Finally, shared storage almost always includes a variety of storage niceties that may not exist in local storage. Capabilities like unified storage (SAN/NAS), snapshots, replication, and automated storage tiering are commonplace in today's storage systems. While many vendors include these capabilities in the storage system at no additional charge, nothing is actually free; most shared storage vendors hold significantly higher profit margins than their local storage competition. <P> As we say in "The Benefits of A Flash Only, SAN-Less Architecture," shared storage proponents can no longer claim that the advantage of being shared is enough justification for this premium cost. In many cases they can't claim a performance advantage. Now operating systems and hypervisors are offering many of the nice-to-have features listed above, so that argument is holding less value as well. To justify their high price, shared storage solutions need to focus on one key area: offer greater capacity efficiencies than local storage. In other words, do the same job while requiring significantly less upfront and ongoing storage costs. <P> Shared storage should be able to do this in two areas. First, it should be able to reduce the physical capacity footprint required in a shared environment. As we discuss in our article "<a href="http://www.storage-switzerland.com/Articles/Entries/2011/9/30_Which_Primary_Storage_Optimization_Strategy_is_Best.html">Which Storage Efficiency Technique is Best</a>," deduplication is an ideal way to reduce storage capacity needs, especially in the virtual environment. Deduplication also greatly benefits from the centralization of data. The more data there is to compare, the greater the chance of redundancy. The technology should become standard on all primary shared storage systems. <P> Shared storage should also allow better use of capacity, since it can be assigned as needed to a given host. Local storage almost always wastes capacity, and it can't reallocate that capacity to another server. This granular allocation is especially important with flash storage, since this capacity is still premium priced. Shared storage can carve up the allocation of flash solid state to the exact requirements of each connecting host, or it can use it as a global pool, accelerating only the most active blocks of storage. As a result, the total SSD investment may be less in shared storage than if storage is purchased on each individual server. <P> Local storage is not only winning on cost and its newfound ability to share, it is also gaining acceptance because of its performance and simplicity. These are topics we will cover in a future article.2012-09-24T08:45:00ZAutomation, Cloud Can Eliminate Storage HeadachesAfter virtualization, automation and the cloud are the next two most important ways to ensure storage provides the performance and capacity your company needs.http://www.informationweek.com/news/240007785?cid=RSSfeed_IWK_authors<!-- KINDLE EXCLUDE --><div class="inlineStoryImage inlineStoryImageRight"> <a href="http://www.informationweek.com/news/galleries/cloud-computing/infrastructure/232901167"><img src="http://twimgs.com/informationweek/galleries/automated/788/01_Transformation-1_tn.jpg" alt="Amazon's 7 Cloud Advantages: Hype Vs.
Reality" title="Amazon's 7 Cloud Advantages: Hype Vs. Reality" class="img175" /></a><br/> <div class="storyImageTitle">Amazon's 7 Cloud Advantages: Hype Vs. Reality</div> <span class="inlinelargerView">(click image for larger view and for slideshow)</span> </div><!-- /KINDLE EXCLUDE -->In my <a href="http://www.informationweek.com/eliminate-storage-headaches-virtualize/240007542">last column</a>, I explained how leveraging virtualization can reduce and maybe eliminate many of the headaches that storage managers deal with when provisioning. But storage management is more than just provisioning storage. You also have to make sure that the storage provides the performance and the capacity that the application demands. The next two steps in eliminating storage headaches are to leverage automation, and to fully embrace remote storage cloud. <P> <strong>Automate</strong><br>Storage performance tuning is making sure that applications are getting just the right amount of performance at the right time. Too little performance means that applications or users are not being as productive as they might be; too much performance means expensive resources sit idle. In the virtualized or cloud data center, performance and making sure it is managed correctly is becoming a time-consuming part of the storage administrator's day. <P> The key to making sure the right data is on the right type of storage at the right moment is knowing when to leverage solid state storage and when to use hard disk storage. In the modern data center this means moving data to higher-speed devices when the applications and users demand it and moving it potentially into a remote storage cloud when they do not. <P> <strong>[ Where can you cut corners with commodity tech? See <a href="http://www.informationweek.com/storage/systems/storage-software-vs-hardware-whats-more/240006946?itc=edit_in_body_cross">Storage Software Vs. Hardware: What's More Important?</a> ]</strong> <P> Some virtualization software applications like the ones I described in my previous column will allow you to move entire volumes, but in most cases the entire volume does not need to be on high-speed storage. Storage automation software should be different. It should understand data activity at a granular, sub-file level for maximum resource allocation. A Sharepoint database for example, might need to be on high-speed storage for performance--but all the documents it manages do not. Over time these could be migrated to high-capacity, low-cost, secondary storage, or to remote cloud storage. <P> <strong>Embrace the cloud</strong><br>The cloud also should be an important part of any storage infrastructure and it, too, can reduce storage headaches. It can help with provisioning tasks so that they can be made more self-serviceable, which allows business application owners to handle their own provisioning requests based on policies and workflow. Cloud storage also can help with storage virtualization efforts if the cloud storage software can leverage multiple types of storage for the on-premise cache. <P> A key headache that gets resolved by cloud storage is dealing with capacity expansion. The right cloud storage application should be able to work with other storage automation features and keep the working set of data local, yet leverage the cloud for an almost infinite amount of capacity backend. 
<P> As we will discuss in our upcoming webinar, <a href="http://www.brighttalk.com/webcast/5583/55487">3 Steps To Use The Cloud To Eliminate Storage Administration Headaches</a>, a final key headache that using the cloud resolves is data protection and disaster recovery. Depending on the configuration, all data can be replicated in near-real time to the cloud, reducing the pressure on the local backup process. <P> In the case of a disaster that causes loss of access to the building, data in the cloud can be recovered from any other location. Recovery in the second site is as easy as re-installing the cloud software and mounting the cloud volumes. Data will be re-cached locally as needed, but applications can return to operation almost immediately. <P> IT is being pressured to be more responsive to the needs of the business and it is in IT's best interest to be seen as an asset to the organization, not a cost. Taking advantage of tools such as automation and the cloud makes storage responsive to the needs of IT so it can be responsive to the needs of the business.2012-09-18T14:15:00ZEliminate Storage Headaches: VirtualizeLearn how virtualization, automation, and cloud computing can make your data center a competitive advantage, not a cost center.http://www.informationweek.com/news/240007542?cid=RSSfeed_IWK_authorsThe primary objective of virtualization and cloud computing is to create a data center that is more responsive to the needs of the business, that enables technology to be a competitive weapon instead of a cost center. Storage has to play a key role in this evolution so it can support these Agile IT initiatives. <P> In our upcoming webinar <a href="http://www.brighttalk.com/webcast/5583/55487">3 Ways To Use The Cloud To Eliminate Storage Headaches</a>, we are going to discuss how to eliminate storage administrative headaches like provisioning, performance management, storage expansion, and data protection. The key to eliminating these headaches is to create a storage value infrastructure which has three steps: virtualize the storage, automate the storage, and cloud-extend the storage. This first column will be devoted to storage virtualization, and we'll cover the other two steps in our next column. <P> Storage virtualization can take two forms. First, the storage administrative processes like provisioning, LUN creation, and RAID configuration can be virtualized as part of the storage system so that you are not dealing with individual hard disk spindles. <P> Provisioning tasks are one of the storage management headaches, and virtualization eliminates them for the most part. Virtualization allows for storage administrators or application owners to simply "dial in" the amount of capacity they need and assign that capacity to a host or virtual machine. The storage software takes care of the rest in the background. <P> <strong>[ Where can you cut corners with commodity tech? See <a href="http://www.informationweek.com/storage/systems/storage-software-vs-hardware-whats-more/240006946?itc=edit_in_body_cross">Storage Software Vs. Hardware: What's More Important?</a> ]</strong> <P> The next type of storage virtualization builds on the above capabilities and extends them across multiple storage platforms, even if those storage platforms are from different vendors. In other words, the typical data center storage infrastructure. It is becoming increasingly apparent that no single storage solution can solve all of a data center's storage needs, especially as that data center grows. 
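<P> The "dial in the capacity" experience described above is really just a thin interface over the steps administrators used to do by hand. The sketch below is a hypothetical illustration, not any vendor's API: pool selection, thin-volume creation, and host mapping are hidden behind a single call.
<pre>
# Hypothetical illustration of "dial-in" provisioning; structures are invented.

def provision(pools, capacity_gb, host):
    """Give a host N GB; pool choice, thin volume creation, and mapping stay hidden."""
    pool = max(pools, key=lambda p: p["free_gb"])     # simplistic placement policy
    if pool["free_gb"] < capacity_gb:
        raise RuntimeError("expand the pool or cloud-extend it first")
    volume = {"name": f"{host}-vol", "size_gb": capacity_gb, "thin": True,
              "pool": pool["name"]}
    pool["free_gb"] -= capacity_gb   # reserve logical space; thin provisioning defers physical allocation
    return {"volume": volume, "mapped_to": host}

if __name__ == "__main__":
    pools = [{"name": "fc-pool", "free_gb": 4000}, {"name": "sata-pool", "free_gb": 12000}]
    print(provision(pools, 500, "esx-host-07"))
</pre>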
<P> The need for multiple storage solutions can cause a serious storage administration headache. Having to separately manage multiple storage systems from multiple storage vendors is time-consuming and error-prone. Storage hardware virtualization eases those headaches by providing a centralized management and data services platform for these assets. These solutions provide flexibility in storage hardware purchases while at the same time unifying operations so that all tasks execute the same way, which reduces administration time and errors. <P> The remote storage cloud also has a role to play in storage virtualization. As we demonstrated in a recent test drive, <a href="http://www.storage-switzerland.com/Blog/Entries/2012/5/30_Tuning_The_Cloud_For_Primary_Storage.html">Tuning The Cloud For Primary Storage</a>, cloud storage software typically leverages a hybrid model, where local cache is leveraged for performance, but remote cloud storage is leveraged for almost limitless capacity. The cloud software will typically use any available storage system and then present this same "dial-in" provisioning. It can work in conjunction with other storage virtualization solutions for a totally unified on-premises / off-premises storage strategy. <P> The next step in creating a storage value infrastructure to reduce administration headaches is to leverage storage automation and to leverage cloud storage's innate data protection and disaster recovery capabilities. We'll cover these aspects in detail in our webinar as well as in our next column.2012-09-12T14:31:00ZHow To Choose Right Unified Storage SystemEvery major storage vendor now offers--or claims to offer--a unified storage system. In this first of a series we look at exactly what a unified storage system is and what it can do for your company.http://www.informationweek.com/news/240007223?cid=RSSfeed_IWK_authorsUnified storage systems are getting a lot of attention lately. These systems provide both block and file services across multiple protocols via a single storage system. They are an ideal option for data centers looking to simplify their storage infrastructure by centralizing all the demands of the data center onto a single storage system. <P> Every major storage vendor has a unified storage system of one kind or another and almost every startup we speak with is claiming to have a unified system. In fact, we are tracking over 35 startups that claim to have a unified storage system. Over the next several columns I'll provide some tips to sort through the maze of options available to you. <P> In this column we'll look at the confusion some vendors create when they claim that another vendor's storage system isn't really unified. As is always the case in technology and especially in storage, a single definition of a term can be hard to come by, and that is true of unified storage systems. <P> <strong>[ Read about another "big" storage question: <a href="http://www.informationweek.com/storage/systems/storages-big-overkill-truth-about-the-tr/232900179?itc=edit_in_body_cross">Storage's 'Big' Overkill: Truth About The Trend</a>. ]</strong> <P> The truth is you are not buying a unified system just to have one. You are trying to find a storage system that will solve your capacity and performance problems while reducing system administration time. If the vendor that best meets your current needs isn't truly unified but solves your problem at a price you can afford, then go with it.
<P> <strong>What Is A Unified Storage System?</strong> <br>The word "unified" implies that a system supports multiple connectivity and access options. For most systems, it means that the unit can serve files (NAS) and blocks (SAN) from the same device. For others, it means that the system can only do block but you can connect via iSCSI or fibre. In the purest sense, a unified system should be able to provide block access across all the available connectivity options (fibre, iSCSI, FCoE, etc.) while at the same time providing file access across NFS and SMB. <P> What is really important is what your data center needs. Do you need high-performance NAS? If you are just serving up home directories to a small group of users, then probably not. But if you are also going to use NAS to store your virtual machine images, then the answer can quickly change to yes. If you can use fibre or iSCSI with either a file server or virtual NAS connected to the storage system, then you might not need a unified system at all. <P> Another challenge is that most unified systems tend to be better at one capability than the other. This means they are a NAS that figured out a way to provide block storage, or they are block storage with some sort of NAS function integrated. Although most environments will use a mixture of workloads, a particular workload will often be the most important. Make sure that you test the specific conditions and configurations that will be most important to you. <P> <strong>Keep Your Options Open</strong> <br>Unified storage systems are about options. Having options means you don't have to make the perfect choice now, nor do you have to predict future needs. You do need to know what is most important to your current situation from both an access and protocol perspective. You also want to make sure that the other components of the unified solution are acceptable to meet the lower-priority needs. <P> In our next column we will discuss some of the key features to look for in a unified storage system, such as SSD integration, mixed workload support, VMware integration, scalability, and efficiency. Then we will wrap up the series with a discussion on whether you should build your own unified system or buy a turnkey system. <P> <i>Extending core virtualization concepts to storage, networking, I/O, and application delivery is changing the face of the modern data center. In the <a href="http://www.informationweek.com/tech-center/storage-virtualization/download?id=189600017&cat=whitepaper?k=axxe&cid=article_axxe">Pervasive Virtualization</a> report, we discuss all these areas in the context of four main precepts of virtualization. (Free registration required.) </i> <P>2012-09-10T08:30:00ZStorage Software Vs. Hardware: What's More Important?Open source software and off-the-shelf hardware both play a role in the commoditization of storage. Consider these 3 questions to understand the issues.http://www.informationweek.com/news/240006946?cid=RSSfeed_IWK_authorsA common theme from storage software vendors over the last few years has been that storage hardware is becoming commoditized and that it is the software that really matters, not the hardware. It is a fair point. Entire companies have been built on the value that the software brings to off-the-shelf hardware. However, a recent trend has been for storage hardware companies to claim that it is the software that is becoming commoditized, not the hardware. <P> <strong>1. 
What Is Storage Software?</strong> <P> Storage software is the software that makes a bunch of disk drives act like a system. At its most basic level the storage software provides volume management, <a href="http://en.wikipedia.org/wiki/RAID">RAID protection</a>, and <a href="http://en.wikipedia.org/wiki/Logical_Unit_Number_masking">LUN masking</a>. Vendors advanced these capabilities significantly over the years and added features like snapshots, thin provisioning, replication, and clones. Most recently they have been adding some form of SSD automation via tiering or caching. <P> <strong>2. How Can Storage Software Become A Commodity?</strong> <P> Storage software can become a commodity by becoming so commonplace that it is included automatically with the operating system, file system, or hypervisor. Look at the capabilities of open storage software products like ZFS, GPFS, GFS, MogileFS, Lustre, Nexenta, DataCore, Gluster, and Caringo (to name a few) and compare them with some of the capabilities from turnkey storage vendors, and you will be surprised at what these products can do. <P> At first glance you may think that this bolsters the argument that software solutions will make the storage hardware a commodity, until you realize that many of the above software solutions are either open source or very aggressively priced. Once something is available for free, something that will never happen to hardware, then it is by definition commoditized. <P> <strong>3. Does Storage Hardware Suddenly Matter?</strong> <P> There are three key drivers behind why storage hardware suddenly matters. The first driver is flash memory. How vendors integrate flash will directly impact your experience with the system. As we discuss in our recent video "<a href="https://www.youtube.com/watch?v=5CwX886CW6E">The SSD Price Problem</a>" not all flash storage is created equal, and the actual flash NAND is one small component of the overall flash solution. A more vertically integrated solution may deliver better performance and density at a better price point. <P> The second driver is the network. As the performance of storage systems begins to scale, the cost and complexity of the storage network become an issue. As we discuss in our article "<a href="http://www.storage-switzerland.com/Articles/Entries/2012/7/25_In_Open_Storage_The_Storage_Infrastructure_Matters.html">In Open Storage The Storage Infrastructure Matters</a>," some storage hardware vendors are pre-integrating low-cost networking options into their storage offerings so that the cost of the storage network does not become greater than the storage itself. <P> Finally there is reliability. We repeatedly see evidence in our labs and in talking with customers that certain vendors deliver higher levels of reliability than others. They accomplish these higher levels of reliability not only by better testing but also by better design. As we discuss in our article "<a href="http://www.storage-switzerland.com/Articles/Entries/2010/3/30_The_Requirements_for_Building_Reliable_Storage_Systems.html">The Requirements for Building Reliable Storage Systems</a>," better storage hardware designs can reduce vibration and increase air flow so that drives run cooler. Vibration and heat tend to be the top killers of hard drives. <P> The end result is that both storage hardware and software are being commoditized at different levels.
There are plenty of systems available that are really software leveraging off-the-shelf hardware, there are systems with hardware that can leverage a variety of software, and there are systems that have commoditized everything (hardware and software). <P> What makes the most sense for your data center depends largely on how much time and motivation you have. The more commoditized an approach you take, the more assembly is required. It will save you money, but may cost you time.2012-09-05T12:10:00ZFlash First: Your Next Storage Strategy?As flash storage costs decline, its performance advantages over hard drives become even more appealing.http://www.informationweek.com/news/240006733?cid=RSSfeed_IWK_authorsMany IT departments have a virtualize-first strategy. This means that anytime a new server is requested, the default reaction is to virtualize that server. A standalone physical server requires special justification. We may be heading the same way with storage, where new storage additions are flash first, and hard drives are used only for storing less active data. <P> Cost has been the key hindrance to solid state device (SSD) adoption, but reducing that cost per effective GB is a key reason that data centers will move to a flash-first strategy. As we discuss in our recent article "<a href="http://www.storage-switzerland.com/Articles/Entries/2012/8/14_SSD_Can_Achieve_HDD_Price_Parity...NOW.html">SSD Can Achieve HDD Price Parity</a>," continued advances in flash controller technology, combined with advanced flash storage system design, have made it possible for flash SSD systems to achieve price parity with enterprise disk storage systems. A key is the enablement of multi-level cell (MLC) based flash systems, which essentially combine consumer-grade flash NAND with advanced controllers to deliver enterprise reliability in a system that also provides enterprise redundancy. <P> On top of safely using MLC-based SSD to drive down price, there is almost universal adoption of deduplication and/or compression in the flash appliance market. The combination can provide five times or greater effective capacity, and flash has the performance capabilities to support the additional workload of deduplication lookups. Not all deduplication is created equal, though, and as we pointed out in our recent webinar "<a href="http://www.brighttalk.com/webcast/5583/49481">What is Breaking Deduplication</a>," users and suppliers need to pay careful attention to make sure deduplication does not become a performance problem as their systems scale in capacity. <P> With cost issues being addressed so rapidly, the other reason for a flash-first strategy is that initiatives like server and desktop virtualization have made storage performance bottlenecks a near-universal problem in the data center. The random I/O that a host loaded with even a few virtual machines generates is significant and can easily tax hard drive-based systems. This problem will increase as the VM density per host increases with each processor upgrade. Random I/O is, of course, the flash storage trump card. Other than DRAM-based systems, nothing responds to random I/O faster than flash. <P> Capacity and capacity management are also less of a concern now. Certainly data continues to grow, but designing a system large enough to store all an organization's data is not that difficult. What was difficult was storing all that data and keeping storage response time acceptable.
Flash resolves the performance problem, and there is a suite of tools and systems that will manage the movement of active data to a flash storage device. <P> Finally, thanks to the performance advantage and easier to justify price point, flash makes the storage administrator's life easier. Once everything, or almost everything, is on flash, the job of performance tuning and scaling virtual machine density becomes significantly easier. Also there are so many ways to implement and leverage flash that you don't have to wait for your storage refresh budget to come through. Flash can be added via a standalone appliance, in the server host, or as a network cache to solve specific performance problems right away.2012-08-28T13:38:00ZThe State Of Virtual Data Protection And RecoveryHybrid physical/virtual storage environments present their own challenges to data protection and backup. Start with a solid plan.http://www.informationweek.com/news/240006390?cid=RSSfeed_IWK_authorsProtection of the virtualized data center is evolving. Legacy products that were around before the dawn of server virtualization are beginning to catch up, feature-wise, with products that came about as a result of virtualization, a category that we call VM-specific backup utilities. It's no longer easy to justify a dual-pronged approach to backup that involves one product for the virtualized environment and a different one for the physical environment. <P> Increasingly, virtual data protection is being done less by VM-specific backup utilities and more by enterprise backup applications. These applications offer support for multiple operating systems, tape support (which remains important), and improved support of the physical environment, while at the same time leveraging the virtual environments' abilities in backup and recovery. <P> At the same time, in order to stay relevant, VM-specific backup utilities are becoming more enterprise-oriented. As discussed in my article,<a href="http://www.storage-switzerland.com/Articles/Entries/2012/8/16_Advancing_the_State_of_Virtualized_Backup.html"> "Advancing The State Of Virtualized Backup,"</a> at least two of these products have recently added support for physical server data protection, and several have expressed an intention to bring tape support to their software as well. <P> <strong>[ What should you expect from the storage system that supports your virtual infrastructure? Read <a href="http://www.informationweek.com/storage/systems/vmware-and-storage-start-with-basics/240005450?itc=edit_in_body_cross">VMWare And Storage: Start With Basics</a>. ]</strong> <P> Essentially, IT vendors are starting to realize that the data centers of today and of the near future will be hybrid environments with large numbers of both stand-alone physical servers and virtual servers. Many of these stand-alone servers are stand alone because of the mission criticality or resource requirements of the applications they host. <P> This hybrid virtual/physical environment makes the disaster recovery process more complicated as well. As discussed in <a href="http://www.storage-switzerland.com/Blog/Entries/2012/8/27_Recovering_the_Hybrid_Environment.html">this recent video</a>, the environments are often intertwined, but each one uses different data protection tools and storage hardware. That means special consideration must be made at the recovery site to ensure that the technology at the remote site can recover both the physical and virtual environments. 
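<P> One practical way to apply that advice is to keep an explicit inventory of protected workloads and check it against what the recovery site can actually restore. The sketch below is a simple planning aid with invented field names, not a reflection of any particular backup product's catalog.
<pre>
# A simple DR planning check with invented field names; adapt to your own inventory.

def recovery_gaps(workloads, site):
    """Return the protected workloads the recovery site cannot currently bring back."""
    gaps = []
    for w in workloads:
        if w["tool"] not in site["restore_tools"]:
            gaps.append((w["name"], f"missing restore support for {w['tool']}"))
        elif w["kind"] == "virtual" and w["hypervisor"] not in site["hypervisors"]:
            gaps.append((w["name"], f"no {w['hypervisor']} host at the recovery site"))
        elif w["kind"] == "physical" and not site["bare_metal_restore"]:
            gaps.append((w["name"], "no bare-metal restore capability"))
    return gaps

if __name__ == "__main__":
    protected = [
        {"name": "erp-db", "kind": "physical", "tool": "enterprise-backup"},
        {"name": "web-vm", "kind": "virtual", "tool": "vm-backup", "hypervisor": "vSphere"},
    ]
    site = {"restore_tools": {"enterprise-backup"}, "hypervisors": {"vSphere"},
            "bare_metal_restore": True}
    print(recovery_gaps(protected, site))  # flags the VM whose backup tool isn't at the DR site
</pre>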
<P> The net impact is this: data protection and recovery is still not as push-button simple as we would like it to be. While virtualization has helped by making servers more like moveable digital containers, it has also added layers of complexity to the disaster recovery process as we deal with the differences between physical and virtual environments. <P> In the end, virtual data protection and recovery comes down to making sure you have the right procedures in place, and that you can recover data, servers, hosts, or the entire environment if and when you need to. Virtualization may have brought some level of push-button simplicity to recovery, but a well-trained IT team armed with a solid plan remains the most important asset in any organization. <P> <i>InformationWeek has published a report on backing up VM disk files and building a resilient infrastructure that can tolerate hardware and software failures. After all, what's the point of constructing a virtualized infrastructure without a plan to keep systems up and running in case of a glitch--or outright disaster? Download our <a href="http://informationweek.com/tech-center/storage-virtualization/download?id=189200009&cat=whitepaper?k=axxe&cid=article_axxe">Virtually Protected</a> report now. (Free registration required.) </i>2012-08-22T09:06:00ZStop SSD SprawlSolid state storage devices are being implemented in servers, networks, and storage systems, leading to sprawl and performance problems in data centers. This has to change.http://www.informationweek.com/news/240005957?cid=RSSfeed_IWK_authorsMost IT managers and storage administrators have come to embrace solid state devices (SSD) as they look at performance options. Vendors have also embraced this reality and have delivered SSD solutions in almost every imaginable form factor and location in the storage infrastructure. There are multiple in-server SSD solutions, multiple in-network SSD solutions, and multiple in-storage SSD solutions. There are so many performance options we are now seeing them sprawl. <P> At the <a href="http://www.flashmemorysummit.com/">Flash Memory Summit</a> we delivered a presentation based on our latest chalk-talk video "Performance Sprawl", which focuses on what is becoming a growing concern in the data center. Performance sprawl occurs when multiple SSD solutions are chosen to fix different performance problems in the enterprise. <P> Thanks especially to server-based SSDs, performance has become a business unit line item not a storage decision. Of course, the storage team eventually is left managing whatever the business units buy and a performance management nightmare develops. We are seeing an increasing number of data centers that have job titles like "Storage Performance Specialist." This won't scale so something needs to change. <P> The change will come from "Cross Domain Data Movement." This technology will move data between multiple storage tiers and locations of storage. Data could be moved from hard drive storage to flash on the storage system and then eventually to SSD in the server. Even within the server there may be a desire to have some data on a SATA/SAS-based SSD drive and some data on PCIe SSD. <P> <strong>[ What's next for solid state storage? Read <a href="http://www.informationweek.com/storage/systems/storage-players-try-to-improve-solid-sta/240005921?itc=edit_in_body_cross">Storage Players Try To Improve Solid State</a>. 
]</strong> <P> The challenge is that fully implemented Cross Domain Data Movement will require some time to develop, potentially as much as two years. Of course, the data center will continue to experience data storage performance issues over the next two years, so what's the strategy until then? Triage. Try to pick one solution that will address most of your performance needs for the longest period of time. <P> If you will have just a few servers that need a performance improvement then SSD inside the server may be ideal. There is also the option of leveraging PCIe in every server and making the storage network a capacity backend for data protection and archive. The economics of this approach may now make sense as PCIe-based SSD continues to come down in price and increase in capacity. <P> If you have multiple servers or hosts that need performance or if the data needs to be accessed by multiple hosts then an in-network SSD appliance makes sense. Selection will come down to the ones that support the protocol you use (NAS, Fibre, iSCSI). If you are fortunate enough to be ready for a storage refresh, then an All-Flash or Hybrid Flash Storage system may be the ideal platform to build your data center on for the next few years. It may also provide the longest term investment protection allowing you to wait as long as possible for cross domain data movement. <P> <i>New innovative products may be a better fit for today's enterprise storage than monolithic systems. Also in the new, all-digital <a href="http://www.informationweek.com/gogreen/072312/?k=axxe&cid=article_axxt_os">Storage Innovation</a> issue of InformationWeek: Compliance in the cloud era. (Free with registration.) </i>2012-08-14T15:55:00ZVMware And Storage: Start With BasicsProvisioning of storage to new hosts and virtual machines (VMs) remains one of the more time consuming tasks in the enterprise.http://www.informationweek.com/news/240005450?cid=RSSfeed_IWK_authorsVMware has been good for the storage industry, but maybe not so good for the storage administrator. Provisioning of storage to new hosts and virtual machines (VMs) remains one of the more time consuming tasks in the enterprise. The speed at which storage can respond to the random workloads of the virtual environment is one of the biggest bottlenecks in performance. And the cost to store all the data that VMs create remains one of the biggest costs of the virtual infrastructure. <P> Storage vendors have flooded the market with various solutions and the options can be overwhelming. As we will discuss in our upcoming webinar <a href="http://www.brighttalk.com/webcast/5583/53113">"The Requirements of VM Aware Storage"</a>, while every data center is unique there are some basics that you should now expect from the storage system that supports your virtual infrastructure. <P> <strong>Solid State Disk</strong> <P> Solid state disks (SSD) are potentially the best answer to the above mentioned random I/O that the virtual environment creates. There is now almost universal agreement on that point. How to implement SSD into the environment is where the disagreement occurs. It is obvious, for the time being, that hard disk (HDD) storage systems will continue to be a mainstay of the data center. The cost of capacity on HDD is simply too good to ignore. <P> The SSD system you select will largely be dependent on where your current HDD storage system is in its lifecycle and just how bad a performance problem you have. 
For organizations that need to get a few more years out of their hard disk-based storage systems, a caching appliance or a standalone SSD appliance can be ideal options. For organizations that are ready for a storage refresh, a tightly integrated Hybrid SSD as we discussed in <a href="http://www.storage-switzerland.com/Articles/Entries/2012/7/26_Hybrid_SSD_Storage_vs._Unified_Storage.html">"Hybrid SSD Storage vs. Unified Storage"</a> may strike the right cost/performance balance for them. Or it may be time to step up to an All-Flash Array, which leverages deduplication and compression to deliver top end but still affordable performance. <P> <strong>VM Aware</strong> <P> Before virtualization, troubleshooting consisted of monitoring the LUN or volume assigned to the connecting server. Now those servers are hosts, with dozens of VMs on them. Even VMs are more than a bunch of blocks on disk. There are components that have different internal parts. There is the system state, the parts of the server itself, and these tend to have write heavy I/O traffic. And there is of course that server's data, which is typically read heavy. All of this means that the storage system needs to understand not only what is going on inside of the host but also inside the VM. This is critical information to make sure that the right data is on the right storage at the right time. <P> <strong>VM Optimized</strong> <P> Performance is not the only challenge, managing capacity is equally important. In the typical virtual environment, all data has been moved to a shared storage device of some kind. This storage needs to be optimized as much as possible. Techniques like deduplication, cloning, and thin provisioning should all be leveraged to extract maximum dollar per GB stored. <P> The intelligent use of SSD, the ability to understand what VMs are doing, and the ability to optimize the capacity being consumed are foundational for the virtualized architecture. They are critical to pushing the data center closer to the 100% virtualized goal while at the same time making sure that the virtualization ROI is maintained. <P> <em>Follow Storage Switzerland on <a href="http://twitter.com/storageswiss">Twitter</a> <P> George Crump is lead analyst of <a href="http://www.storage-switzerland.com">Storage Switzerland</a>, an IT analyst firm focused on the storage and virtualization segments. Storage Switzerland's <a href="http://www.storage-switzerland.com/Disclosure.html">disclosure statement</a>.</em> <P> <i>New innovative products may be a better fit for today's enterprise storage than monolithic systems. Also in the new, all-digital <a href="http://www.informationweek.com/gogreen/072312/?k=axxe&cid=article_axxt_os">Storage Innovation</a> issue of InformationWeek: Compliance in the cloud era. (Free with registration.) </i>2012-08-07T13:31:00ZHow To Solve 2 VDI Performance ChallengesHere's how to fix storage performance problems that creep up after you consolidate hundreds of desktops onto a single host.http://www.informationweek.com/news/240005107?cid=RSSfeed_IWK_authorsIn a recent column, <a href="http://www.informationweek.com/news/storage/systems/240004622">Overcome Cost Challenges Of VDI</a>, we looked at cost, the key roadblock to a virtual desktop infrastructure (VDI) project. If you can't justify the cost of the investment, then all the other issues are moot. 
The good news is that many storage systems have implemented cost-saving techniques that allow the VDI justification process to move to the next step: How to deal with the performance issues that the environment can cause. <P> It seems odd that storage performance for the VDI is a problem. After all, most users' desktops have very modest performance demands. But it is the consolidation of potentially hundreds--if not thousands--of desktops onto a single host that causes the problem. While each may only need modest performance, the combined random storage I/O of so many users can put a storage system to the test. Again, as we mentioned in our recent column, that performance has to be delivered cost effectively. <P> Beyond the need to provide consistent performance to all of these desktops, there are two specific situations in the VDI use-case that storage infrastructures need to prepare for. First there are boot storms: what happens to the VDI when users all arrive at work and start the login process at about the same time. The storage system gets flooded with these requests and may end up so saturated that it can take five to 10 minutes before users' desktops are ready to go. <P> Much of the storage industry gets stuck on the boot storm issue and focuses solely on solving that issue. As we discuss in our recent article <a href="http://www.storage-switzerland.com/Articles/Entries/2012/7/16_VDI_Storage_Performance_Is_More_Than_Just_Boot_Storms.html">VDI Storage Performance Is More Than Just Boot Storms</a>, there is an equally important second problem: the amount of write I/O that storage systems supporting VDI need to deal with. There is the standard write I/O that a user's desktop will create, but multiplied by 1,000 in the VDI. And there is also the write impact caused by the heavy use of thin-provisioned masters and clones. These are the cost-saving techniques that we described in our last column--and write I/O performance is their downside. <P> When thin provisioning, masters, and clones are in use, each time a new piece of data has to be written to a virtual desktop, additional capacity has to be allocated and prepared for the desktop's file system before the data can finally be written. The combination of all of these steps, again multiplied by thousands of desktops, can lead to latency that will impact user performance and inhibit user acceptance of the VDI project. Hypervisor file systems are particularly inefficient at managing these dynamic write-allocation issues. <P> The almost universal solution is to leverage solid state devices (SSD) to alleviate both issues. The problem is: Which SSD implementation method should you use? There are tiering/caching approaches that automatically move the blocks of data needed for virtual desktop boot to an SSD tier, but some of these solutions don't assist write I/O performance. There are the flash-only array solutions that address both read and write traffic, but the cost premium needs to be dealt with. <P> Instead of throwing hardware at the problem it may be better to address the root cause, the hypervisor's file system. As we discuss in our article <a href="http://www.storage-switzerland.com/Articles/Entries/2012/7/2_How_to_Afford_SSD_For_VDI_or_Virtual_Servers.html">How To Afford SSD for VDI</a>, fixing the file system first may reduce the amount of SSD required for VDI performance demands. <P> The performance challenges that VDI creates for the storage infrastructure can be overcome.
The key is to overcome them as cost effectively as possible. Leveraging solutions that deliver the right mix of storage efficiency techniques and the right amount of high performance SSD can deliver the performance/cost balance needed to make the VDI project successful. In our final column in this series we will detail how storage systems should be designed to help you strike that balance. <em>Follow Storage Switzerland on <a href="http://twitter.com/storageswiss">Twitter</a> <P> George Crump is lead analyst of <a href="http://www.storage-switzerland.com">Storage Switzerland</a>, an IT analyst firm focused on the storage and virtualization segments. Storage Switzerland's <a href="http://www.storage-switzerland.com/Disclosure.html">disclosure statement</a>.</em>
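<P> A closing note on the boot storm and write I/O discussion above: rough arithmetic is often enough to see whether a proposed design will survive the morning login rush. The numbers below (I/Os per boot, array IOPS, read/write mix) are assumptions chosen only to illustrate the shape of the problem; note that a read cache of the master image helps the boot storm but leaves the write I/O untouched.
<pre>
# Rough sizing arithmetic with assumed numbers; adjust for your own environment.

def boot_storm_minutes(desktops, io_per_boot=15_000, array_iops=20_000,
                       read_fraction=0.8, cache_hit_rate=0.0):
    """Minutes for all desktops to finish booting against a shared array.

    cache_hit_rate models boot reads served from a flash cache of the shared
    master image, so they never reach the back-end disks. Writes are unaffected.
    """
    reads = desktops * io_per_boot * read_fraction * (1.0 - cache_hit_rate)
    writes = desktops * io_per_boot * (1.0 - read_fraction)
    return (reads + writes) / array_iops / 60.0

if __name__ == "__main__":
    print(round(boot_storm_minutes(1000), 1), "minutes with no flash cache")
    print(round(boot_storm_minutes(1000, cache_hit_rate=0.9), 1),
          "minutes with 90% of boot reads served from flash")
</pre>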