Sometimes one is better than many. Nowhere is that more apparent than in data storage. The traditional way to protect data is by copying it multiple times. These copies, combined with the need to provide access to shared data in multiple locations, create an unwanted explosion of duplicated data in an infrastructure already strained by years of relentless data growth.
Companies struggling under the combined pressure of Tier-2 file-data growth and an expanding global storage footprint increasingly rely on a new generation of global file systems. They enable IT to consolidate all copies of data required for protection and access into a single master copy.
File-data consolidation elevates the IT conversation from the nuts and bolts of storage provisioning, backup, disaster recovery, etc., to a discussion about data management that is well aligned to the business: Who needs access to this data? Where do they need it? What level of performance is required?
A global file system brings much needed order and efficiency to IT organizations struggling with data growth across multiple geographies. That said, the requirements for stability, scale, performance, security, and even the central management of such a global infrastructure technology are formidable and must rely on a new hybrid architecture that combines the nascent resources of the cloud (infrastructure providers) with more traditional data center technology.
To achieve data stability at scale, the modern global file system combines versioning with replication, much like traditional snapshot-and-replication techniques. Here, however, snapshots cannot be capped at a maximum count; doing so would deny IT one of the cloud's most powerful features: effectively infinite capacity for versions, which eliminates the need for separate backup.
Each file in a modern file system contains every version of itself, which is then replicated across multiple availability zones by the cloud infrastructure providers in order to protect against a single zone's failure. Providers like Amazon and Microsoft are adept at this sort of massive, geographically dispersed replication, which carries the added benefit of increasing the fluidity of data and the speed at which data can be accessed from anywhere in the world.
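The versioning model described above can be sketched in a few lines. This is a minimal, illustrative store (the class and method names are my own, not any vendor's API): every write appends an immutable new version rather than overwriting, so there is no version cap and any prior state remains addressable.

```python
import hashlib
from collections import defaultdict

class VersionedStore:
    """Toy sketch of a versioned file store: every write appends an
    immutable version instead of overwriting, so no version limit
    exists and older states stay readable."""

    def __init__(self):
        # path -> list of (content_hash, data) tuples, oldest first
        self._versions = defaultdict(list)

    def write(self, path, data):
        digest = hashlib.sha256(data).hexdigest()
        self._versions[path].append((digest, data))
        return len(self._versions[path]) - 1  # index of the new version

    def read(self, path, version=-1):
        # Default: latest version; any older version remains addressable.
        return self._versions[path][version][1]
```

In a real system each version's objects would then be replicated by the cloud provider across availability zones; the sketch only shows the append-only version chain.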
Performance requirements for Tier-2 file data can vary wildly. A few thousand engineers working on a chip design require much higher performance than does a branch office working on spreadsheets. I know an ambitious executive VP of infrastructure who is using a global file system to allow dozens of production sites worldwide to share a 15-terabyte work set of video data, while some of his smaller sites require only relatively low-performance access to Microsoft Office documents.
The key to performance, then, is flexibility. But to enable file-data consolidation, the global file system must take IT out of the business of copying data to each site where performance is needed. Instead, it must rely on the fluidity of the cloud backend and on caching and pre-staging algorithms that move needed data into each data center, where the hardware appliances racked at each location deliver the required I/O per second.
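The caching-and-pre-staging behavior can be illustrated with a simple edge cache. This is a hypothetical sketch, not any product's implementation: `fetch_from_cloud` stands in for the backend transfer, hot files are served locally in LRU order, and `prestage` warms the cache before users ask for the data.

```python
from collections import OrderedDict

class EdgeCache:
    """Sketch of an edge-appliance cache: serves hot files locally and
    falls back to the cloud backend only on a miss."""

    def __init__(self, fetch_from_cloud, capacity=4):
        self._fetch = fetch_from_cloud      # hypothetical backend callable
        self._capacity = capacity
        self._cache = OrderedDict()         # path -> data, LRU order

    def get(self, path):
        if path in self._cache:
            self._cache.move_to_end(path)   # cache hit: mark recently used
            return self._cache[path]
        data = self._fetch(path)            # cache miss: pull from cloud
        self._put(path, data)
        return data

    def prestage(self, paths):
        # Pre-staging: warm the cache before users request the data.
        for path in paths:
            if path not in self._cache:
                self._put(path, self._fetch(path))

    def _put(self, path, data):
        self._cache[path] = data
        self._cache.move_to_end(path)
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict least recently used
```

A production system would add smarter pre-staging heuristics (access patterns, schedules), but the division of labor is the same: the cloud holds the master copy, the appliance delivers local I/O.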
And because data is stored not in the data center but in the cloud, a modern global file system must secure data at the endpoints by allowing IT to generate and guard its own encryption keys.
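The principle here is client-side encryption: data is encrypted at the endpoint with a customer-generated key before it ever leaves the site, so the cloud provider stores only ciphertext. The sketch below illustrates that flow with a deliberately simple XOR stream construction built from a hash function; it is for illustration only, and a real deployment would use an authenticated cipher such as AES-GCM from a vetted cryptography library.

```python
import hashlib

def keystream(key, length):
    """Derive a pseudorandom keystream from the customer-held key.
    Illustrative only -- not a substitute for a vetted cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key, plaintext):
    # XOR with the keystream; this runs at the endpoint, so the cloud
    # backend only ever receives ciphertext and never sees the key.
    return bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))

decrypt = encrypt  # XOR stream encryption is its own inverse
```

The point of the design is key custody: because IT generates and holds the key, neither the cloud provider nor anyone who compromises the stored objects can read the data.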
Global file systems that connect disparate geographies can eliminate the stress introduced by distance and give every worker equal access to data in the infrastructure. They do so by turning traditional data center thinking inside out, moving central management out of the data center and into a global core service that can monitor and manage every system regardless of its location. This shift has already gone mainstream in networking with systems like Aerohive and Cisco's Meraki. It is only a matter of time before the same model simplifies storage.
Rather than the data center being the center of everything, data centers become endpoints responsible for security and the required level of performance, while every other infrastructure function shifts to a core cloud infrastructure.
To imagine a truly global storage infrastructure, one must think beyond the confines of any one physical appliance or data center. Only then can organizations harness the power of one copy of data, protected in many ways, accessible anywhere.