The wide area network (WAN) is used for a variety of purposes, mostly to connect branch offices to the main data center to share applications and data. Optimization of these connections is critical since the majority of the links are relatively slow. The demand for remote replication and backup, though, is driving another type of WAN use: data-center-to-data-center communication. This storage WAN has its own unique optimization requirements.
What's Driving The Storage WAN?
Not too long ago, replicating data from one storage system to another was an expensive disaster recovery option. Now many storage systems include data replication for free; all you need is a second storage system from that vendor. Also, in many cases you no longer need specialized storage hardware, as the hypervisor, third-party replication applications, and many database applications can replicate to a remote storage system. As a result, more companies are able to afford replication to protect against disaster, and they are also able to afford to replicate a larger segment of their data than ever.
The rest of the data, which is not being replicated directly from primary storage, can be easily replicated as part of the backup process. Thanks to deduplication, backups can be vaulted electronically across the WAN efficiently, and the practice is extremely popular. Most backup appliance vendors report over 50% of their customers are buying two sets of appliances and actively using backup replication.
Impact On The WAN
Together, these two tasks mean that big chunks of data are now streaming across the WAN segment in very short windows, typically overnight. Compare this to branch office communications, where traffic consists of small updates of email or application databases and access to file shares. As we discussed recently in our webinar "The Five Questions WAN Optimization Vendors Don't Want You To Ask," the optimization requirements are different when dealing with the storage WAN instead of the branch office WAN.
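To make the "big chunks in short windows" point concrete, here is a minimal sketch of the arithmetic. The function name and the 10 TB / 8 hour figures are illustrative assumptions, not numbers from the webinar or from any vendor, and the calculation ignores protocol overhead and retransmissions.

```python
def required_gbps(data_tb: float, window_hours: float) -> float:
    """Sustained throughput (Gbit/s) needed to move data_tb decimal
    terabytes within window_hours. Ignores protocol overhead."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    gigabits = data_tb * 8 * 1000  # 1 TB = 1000 GB = 8000 Gbit
    return gigabits / (window_hours * 3600)


if __name__ == "__main__":
    # Hypothetical nightly job: 10 TB of replication and backup in 8 hours.
    print(f"{required_gbps(10, 8):.2f} Gbit/s")  # prints "2.78 Gbit/s"
```

A link sized for branch office traffic would not come close to sustaining that rate, which is why the storage WAN is usually a separate, faster connection.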
Storage WAN Vs. Branch Office WAN
The single biggest difference is the available bandwidth. In most cases, storage WANs carry data-center-to-data-center communications and the bandwidth is much higher. So high, in fact, that deduplication, a common branch office optimization method, may not be appropriate for the storage WAN. Deduplication takes time, so using it is a trade-off between how long the analysis takes and how much time it saves on the WAN transfer. The faster the WAN, the less time deduplication has to do its job. For the storage WAN, this may mean turning deduplication off or investing in more processing power on the WAN optimizer so that the deduplication process does not add latency to the data transfer.
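The trade-off can be sketched with a deliberately simple model. It assumes deduplication runs inline, so the optimizer can never move source data faster than its deduplication engine can analyze it, and that the reduced stream then crosses the WAN. The function names, the 5:1 reduction ratio, and the link and engine speeds are all hypothetical values chosen for illustration.

```python
def effective_gbps(wan_gbps: float, engine_gbps: float,
                   reduction_ratio: float) -> float:
    """Source-data throughput with inline deduplication enabled.

    engine_gbps     -- rate at which the optimizer can analyze data (assumed)
    reduction_ratio -- e.g. 5.0 means 5 units in, 1 unit on the wire (assumed)
    """
    # Limited either by the analysis engine or by the WAN carrying
    # the reduced stream, whichever is slower.
    return min(engine_gbps, wan_gbps * reduction_ratio)


def dedup_helps(wan_gbps: float, engine_gbps: float,
                reduction_ratio: float) -> bool:
    """True if deduplication moves source data faster than the raw link."""
    return effective_gbps(wan_gbps, engine_gbps, reduction_ratio) > wan_gbps


if __name__ == "__main__":
    # Slow branch link: the engine easily keeps up, so dedup pays off.
    print(dedup_helps(wan_gbps=0.1, engine_gbps=2.0, reduction_ratio=5.0))
    # Fast storage WAN: the same engine is now slower than the link itself.
    print(dedup_helps(wan_gbps=10.0, engine_gbps=2.0, reduction_ratio=5.0))
    # More processing power on the optimizer restores the benefit.
    print(dedup_helps(wan_gbps=10.0, engine_gbps=20.0, reduction_ratio=5.0))
```

Under these assumptions the three cases come out True, False, and True, matching the two options described above: turn deduplication off, or buy enough processing power that the engine outruns the link.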
The available bandwidth also puts greater pressure on the WAN optimizer to use the available network connections to their fullest. This may mean customized internal switching or more sophisticated code to drive the available ports at their maximum.
Getting maximum use out of the available network ports is also important because of the speed at which data can be sent to the on-premises WAN optimizer. Storage WAN data is not generated by small clients, but by storage and backup appliances that can send data to the optimizer at very high speeds. As with bandwidth, the WAN appliance needs enough ingest and output capacity that it does not become its own bottleneck.
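One way to reason about whether the appliance is its own bottleneck is to treat it as a chain of stages and look for the slowest one. The sketch below does only that; the stage names and rates are made-up examples, not measurements from any product.

```python
from typing import Dict, Tuple


def find_bottleneck(stage_gbps: Dict[str, float]) -> Tuple[str, float]:
    """Return the slowest stage and its rate; end-to-end throughput
    through a chain of stages cannot exceed this value."""
    name = min(stage_gbps, key=stage_gbps.get)
    return name, stage_gbps[name]


if __name__ == "__main__":
    # Hypothetical appliance: fast ingest from backup appliances,
    # slower internal processing, 10 Gbit/s of WAN-facing ports.
    stages = {"ingest_ports": 20.0, "processing": 8.0, "wan_output_ports": 10.0}
    print(find_bottleneck(stages))  # prints ('processing', 8.0)
```

In this made-up case the WAN-facing ports are never saturated because internal processing tops out first, which is the situation the appliance needs to avoid.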
The faster connection and the large amount of data that has to be transmitted in a shorter time window mean that the storage WAN requires a different approach to WAN optimization than the branch office does. Tried-and-true branch office optimization technologies like deduplication need to be rethought to see if, and how, they should be applied to the storage WAN.