Scale matters. When it comes to datacenter infrastructure, scale separates technology that enables an organization to thrive from technology that constantly gets in the way as it crumbles under growth. Nowhere is scale more important than when it comes to storage, and in particular file storage. Files in the form of email, documents, and other unstructured information make up 80% of enterprise data, and files grow at a faster rate than all other data types combined.
Cloud storage enables file synchronization at scale.
The rush to virtualize everything moved every file server into the SAN while forcing file-specific platforms, like NAS, to become dumb block devices with an emphasis on performance and cost rather than on scale. But files did not go away. While datacenters embraced virtualization, the file footprint continued to double every two or three years. Traditional storage, data protection, and replication strategies are running out of steam against the relentless growth of file data. Even the leanest IT organizations can expect their storage costs to keep rising as they upgrade their SANs and try to enforce unpopular capacity quotas on their users.
I recently met the CIO of a large financial institution who had been asked to cut IT costs 8% per year. Internally, there was a strong push toward outsourcing or offshoring some core IT functions to achieve these savings. He told the organization that this approach might reduce costs in the short term, but that the savings would last for only one year, primarily because of data growth. He has just over a petabyte under management and expects that number to reach 2 PB in 2015. Most of that data consists of files, which are subject to lengthy retention policies set by the financial regulatory bodies. He asked his staff to stop thinking about how to save money in small, incremental ways and, instead, to rethink their data storage strategy completely.
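To see why incremental savings lose to data growth, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that storage spending has to follow the same 8% annual cut and that data doubles every two or three years, as described earlier. The function name and the simplification that cost scales linearly with capacity are assumptions, not figures from the CIO.

```python
def required_unit_cost_change(budget_cut_per_year, doubling_years):
    """Yearly fractional change in cost per unit of data needed to stay on budget.

    Assumes (hypothetically) that spend scales linearly with stored capacity.
    A negative result means the unit cost has to fall by that fraction each year.
    """
    budget_factor = 1.0 - budget_cut_per_year      # e.g. 0.92 for an 8% cut
    growth_factor = 2.0 ** (1.0 / doubling_years)  # yearly data growth multiplier
    return budget_factor / growth_factor - 1.0


for years in (2, 3):
    change = required_unit_cost_change(0.08, years)
    print(f"doubling every {years} years: unit cost must change by {change:.1%} per year")
```

Under these assumptions the cost per unit of stored data would have to fall by roughly 27% to 35% every year, which is a much steeper decline than a one-time outsourcing saving can provide.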
The core of any cloud storage system is an object store. Object stores power the largest public clouds, like Amazon Web Services' S3 and Microsoft's Azure, as well as some of the private cloud storage systems from the likes of Cleversafe and EMC. The physical underpinning of any object store is a cluster of servers, a.k.a. nodes, each with its own direct-attached storage and each connected through Ethernet. The hardware used for the nodes is nothing special. The magic is all in the software.
An object store does one thing really well: it replicates data reliably and at scale. Replicating the objects not only protects the data but also amplifies access to it by allowing objects to be read from many nodes in the cluster. This is not unlike the behavior of content distribution networks. A proper object store core gives an organization access to a massive replication engine at a ridiculously low price point.
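The replicate-on-write, read-from-any-node behavior described above can be sketched in a few lines. This is a toy in-memory model, not how any named product works: the class `ToyObjectStore`, the replica count of three, and the use of rendezvous hashing for placement are all assumptions made for illustration.

```python
import hashlib
import random


class ToyObjectStore:
    """In-memory stand-in for a cluster of storage nodes (hypothetical)."""

    def __init__(self, node_names, replicas=3):
        if replicas > len(node_names):
            raise ValueError("replicas cannot exceed the number of nodes")
        self.replicas = replicas
        # Each node is modeled as its own dict, standing in for direct-attached storage.
        self.nodes = {name: {} for name in node_names}

    def _placement(self, key):
        # Rendezvous (highest-random-weight) hashing: score every node for this
        # key and keep the top `replicas` nodes. One common scheme among many.
        def score(name):
            return hashlib.sha256(f"{name}:{key}".encode()).hexdigest()

        return sorted(self.nodes, key=score, reverse=True)[: self.replicas]

    def put(self, key, data):
        # Write the object to every node chosen for this key.
        targets = self._placement(key)
        for name in targets:
            self.nodes[name][key] = data
        return targets

    def get(self, key, down=()):
        # Any live replica can serve the read, which spreads load across the
        # cluster and lets reads survive the loss of some nodes.
        holders = [
            name
            for name in self._placement(key)
            if name not in down and key in self.nodes[name]
        ]
        if not holders:
            raise KeyError(key)
        return self.nodes[random.choice(holders)][key]


store = ToyObjectStore([f"node{i}" for i in range(5)], replicas=3)
holders = store.put("report.docx", b"quarterly numbers")
# Even with two of the three replica nodes offline, the object is still readable.
print(store.get("report.docx", down=holders[:2]))
```

A real object store adds durability guarantees, repair of lost replicas, and geographic placement rules on top of this basic idea.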
But the object store is only half of the strategy. The other half is a new generation of storage controllers that bridge the gap to support traditional infrastructure workloads. Cloud-integrated storage (CIS) devices (also called cloud gateways) from the likes of CTERA, Nasuni, and Panzura look and feel like traditional NAS devices and provide the same advanced capabilities as traditional NAS.
But, behind the scenes, CIS devices transform every file and every file change into a unique string of smaller files or objects that are then shipped to the object store core. There, each object is immediately replicated among nodes and geographically dispersed across datacenters. By transferring the state of their file systems natively to the object store, these reinvigorated NAS platforms can scale to absorb an unlimited number of files or snapshots regardless of the size of the device. Files that are synchronized to the object store core are immediately replicated and available anywhere in the world, which enables shared access to files from multiple locations and whole new ways to deliver disaster recovery and business continuity. This is the scalable infrastructure version of personal productivity applications like Dropbox and Box.
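As a rough illustration of turning files and file changes into objects, the sketch below splits a file into chunks, names each chunk by its hash, and uploads only the chunks the store has not seen before. The fixed chunk size, SHA-256 naming, and the function names are assumptions for this example; actual CIS products use their own formats and typically add compression and encryption.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def chunk_file(data, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks keyed by their SHA-256 digest.

    Returns (manifest, chunks): the manifest is the ordered list of chunk IDs
    that describes one version (snapshot) of the file.
    """
    manifest, chunks = [], {}
    for offset in range(0, len(data), chunk_size):
        piece = data[offset : offset + chunk_size]
        chunk_id = hashlib.sha256(piece).hexdigest()
        manifest.append(chunk_id)
        chunks[chunk_id] = piece
    return manifest, chunks


def sync(data, object_store):
    """Ship only chunks the object store does not already hold."""
    manifest, chunks = chunk_file(data)
    uploaded = 0
    for chunk_id, piece in chunks.items():
        if chunk_id not in object_store:
            object_store[chunk_id] = piece
            uploaded += 1
    return manifest, uploaded


def restore(manifest, object_store):
    """Rebuild a file version from its manifest."""
    return b"".join(object_store[chunk_id] for chunk_id in manifest)


object_store = {}  # a plain dict stands in for the object store core
v1_manifest, v1_uploaded = sync(b"aaaabbbbcccc", object_store)
v2_manifest, v2_uploaded = sync(b"aaaaXXXXcccc", object_store)
print(v1_uploaded, v2_uploaded)  # the second version only ships its changed chunk
print(restore(v1_manifest, object_store), restore(v2_manifest, object_store))
```

Because each version is just a small manifest pointing at shared chunks, keeping many snapshots costs little beyond the data that actually changed, which is one plausible reading of how such devices can hold far more history than their local capacity suggests.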
Scale changes everything. Organizations can expect dramatic and sustainable cost savings by shifting file data to cloud-based architectures. The financial institution I mentioned above has chosen to deploy a 1.3-petabyte core object store with CIS appliances that will replace its midrange storage controllers in dozens of locations. The system will create a unified storage platform that uses synchronization at scale to protect data and make it available to hundreds of locations around the world. The cost savings that result from shifting to and growing within the new model will neutralize even the most aggressive data growth projections.
IT organizations tend to be extremely conservative, but tackling the data explosion requires bold new thinking and the abandonment of tried and tired methodologies.