9/4/2013
03:15 PM

Object Storage Vs. Overloaded File Servers

Growth in unstructured data is breaking the capacity of many file servers and NAS systems. One possible solution: object storage with its space-saving flat file system.

File servers and network-attached storage (NAS) systems are suffering from a data deluge of unbelievable proportions. This growth is coming primarily from unstructured data, in other words, data not held in databases. That data is growing is no surprise. The pace at which it is growing, however, is catching some data centers off guard.

Unstructured data is no longer just files from office productivity applications. Although the number and size of those files are growing, the real problem is coming from media such as videos and podcasts and from machine-generated data from devices such as Wi-Fi security cameras.

The Problem With NAS

This growth in unstructured data is breaking many file servers and NAS solutions. First, there is the hard capacity limit that these systems have built into them. The further a system can scale capacity, the more that system costs up front, an expense that an organization might not be able to justify. The alternative is to buy a smaller system that costs less but needs to be replaced more often.

Scale-out storage was supposed to be a solution to this problem. And largely it is. Scale-out NAS allows for multiple NAS heads to be clustered so that their capacity is aggregated and they can be managed as one distinct unit. The challenge for scale-out storage systems is that they might start too large for some organizations because you need enough nodes -- typically three -- to establish the initial cluster. Also, many scale-out NAS solutions won't allow you to mix node sizes, so you can't start with "small" nodes and then add big ones later.

The second problem is universal: both scale-out and traditional NAS systems have a finite limit on the number of files that they can support before performance is affected. The performance impact occurs long before the theoretical file limit for the NAS is reached. Every file has metadata, and the NAS has to maintain and use that metadata to serve files and protect that data. The more files there are, the more metadata needs to be managed. This management consumes processing power on the NAS controller and adds overhead that slows file system responsiveness.

The result is that many data centers end up buying a new NAS system before their current one is at maximum capacity. In fact, 50% full is a common threshold at which organizations stop adding data to a NAS and buy a new one.
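To make the effect of that threshold concrete, here is a minimal arithmetic sketch. The 50% figure comes from the text above; the raw capacity, price, and data volume are hypothetical numbers chosen only for illustration, and the function names are not from any product.

```python
import math


def usable_capacity_tb(raw_tb: float, fill_percent: int) -> float:
    """Capacity actually used before the system is considered 'full'."""
    return raw_tb * fill_percent / 100


def systems_needed(data_tb: float, raw_tb: float, fill_percent: int) -> int:
    """Number of identical systems required to hold data_tb of data."""
    return math.ceil(data_tb / usable_capacity_tb(raw_tb, fill_percent))


def cost_per_usable_tb(price: float, raw_tb: float, fill_percent: int) -> float:
    """Purchase price divided by the capacity that is actually used."""
    return price / usable_capacity_tb(raw_tb, fill_percent)


# Hypothetical example: a 100 TB NAS costing 100,000 (any currency),
# retired at 50% full, asked to hold 400 TB of unstructured data.
RAW_TB = 100
PRICE = 100_000
DATA_TB = 400

nas_count = systems_needed(DATA_TB, RAW_TB, 50)
nas_cost = cost_per_usable_tb(PRICE, RAW_TB, 50)
print(f"Systems needed at a 50% fill threshold: {nas_count}")
print(f"Cost per usable TB at a 50% fill threshold: {nas_cost:.0f}")
```

Under these assumed numbers, stopping at 50% full doubles the effective cost per terabyte compared with the raw capacity on the price list. The same functions can be evaluated at any other fill threshold.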

A potential solution is object storage. These systems can be filled to 80% of capacity or more without being bogged down by complex metadata operations. The object file system is essentially flat -- you don't create complex paths to data. Each file or object is assigned an ID or serial number, and the object is accessed through that number.
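The flat, ID-based access model can be illustrated with a toy in-memory sketch. This is not how any particular object storage product is implemented; the class name, the method names, and the use of random UUIDs as object IDs are all assumptions made for the example.

```python
import uuid
from typing import Optional


class FlatObjectStore:
    """Toy in-memory object store with a flat namespace (illustrative only)."""

    def __init__(self) -> None:
        # A single flat mapping from object ID to (data, metadata).
        # There is no directory tree to walk or keep consistent.
        self._objects = {}

    def put(self, data: bytes, metadata: Optional[dict] = None) -> str:
        """Store an object and return the ID assigned to it."""
        object_id = uuid.uuid4().hex  # hypothetical ID scheme
        self._objects[object_id] = (data, dict(metadata or {}))
        return object_id

    def get(self, object_id: str) -> bytes:
        """Fetch an object directly by its ID, with no path traversal."""
        return self._objects[object_id][0]

    def metadata(self, object_id: str) -> dict:
        """Return a copy of the metadata stored alongside the object."""
        return dict(self._objects[object_id][1])


store = FlatObjectStore()
camera_clip_id = store.put(b"video-bytes", {"source": "camera-1"})
clip = store.get(camera_clip_id)
```

The caller keeps the returned ID and uses it for every later access, which is the role the "ID or serial number" plays in the description above.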

The problem with object-based storage is that systems that use it are targeted at large cloud providers with billions of files. Although the cost per gigabyte to purchase and manage a large system is very compelling, the initial buy-in is beyond the reach of most organizations.

We are seeing the emergence of file systems and object storage systems that are designed to start very small but offer similar expansion. They also often offer common interfaces -- NFS, CIFS, iSCSI -- to the object storage system instead of requiring a REST API to get to files. Some can be installed as a virtual appliance and others are scale-out designs that can start with a single node.
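As a rough illustration of how a file-style interface might sit in front of a flat object store, the sketch below keeps a simple index from file paths to object IDs. It is an assumption about one possible design, not a description of any vendor's NFS, CIFS, or iSCSI gateway; the class and method names are hypothetical.

```python
import uuid


class PathGateway:
    """Toy front end that maps file paths onto a flat object store."""

    def __init__(self) -> None:
        self._objects = {}  # flat store: object ID -> bytes
        self._index = {}    # translation layer: file path -> object ID

    def write_file(self, path: str, data: bytes) -> str:
        """Write data under a path; reuse the object ID if the path exists."""
        object_id = self._index.get(path)
        if object_id is None:
            object_id = uuid.uuid4().hex  # hypothetical ID scheme
            self._index[path] = object_id
        self._objects[object_id] = data
        return object_id

    def read_file(self, path: str) -> bytes:
        """Resolve the path to an object ID, then fetch the object."""
        return self._objects[self._index[path]]


gateway = PathGateway()
gateway.write_file("/videos/vmworld/day1.mp4", b"raw footage")
footage = gateway.read_file("/videos/vmworld/day1.mp4")
```

Applications keep using familiar paths, while the storage underneath stays flat and ID-addressed.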

Unstructured data is going to be a challenge for data centers of all sizes. Even Storage Switzerland struggles with this. When we produce videos at events like VMworld, we create terabytes of data in a few days' time. And like everyone else, we store this data forever. The time to explore object storage for more traditional uses is now, before you end up with dozens of NAS systems.
