Big Data. Big Decisions
InformationWeek
Special Coverage Series

Commentary

George Crump

Keeping RAID Alive

When it comes to protecting data on disk, few technologies are more universal than RAID. It faces challenges in the future data center, but it is hardly alone in that.

When it comes to protecting data on disk, it seems that no technology is more universally applied than a Redundant Array of Independent Disks (RAID). But RAID has two problems that are leading many to think it has a limited future in tomorrow's data center. Vendors, though, are doing their best to address these issues, and there are some sensible workarounds that will keep RAID a viable option for the foreseeable future.

The goal of RAID is to protect you from data loss if a drive fails, and to provide that protection in a cost- and capacity-efficient way. RAID does this by striping your data, along with parity information, across a group of drives. If one of those drives fails, data is still available to users and applications because the other drives can calculate what was supposed to be on the failed drive. The failed drive can then be replaced, and the data that was on the original drive can be rebuilt from the remaining drives.
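To make "the other drives can calculate what was supposed to be on the failed drive" concrete, here is a minimal sketch of single-parity reconstruction. It assumes a simple XOR parity scheme of the kind used by RAID 5; the drive contents and the four-drive layout are invented for illustration and do not reflect any particular vendor's implementation.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data drives, each holding one 4-byte block of a stripe.
data_drives = [b"RAID", b"disk", b"data"]

# The parity block is the XOR of the data blocks and lives on a fourth drive.
parity = xor_blocks(data_drives)

# Simulate losing the second drive, then recompute its block
# from the two surviving data blocks plus the parity block.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_drives[1])
```

Because XOR is its own inverse, the same function that produced the parity also recovers the missing block. A rebuild repeats this calculation for every stripe on the drive, which is why it is both read heavy and compute heavy.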

RAID has two problems that some think may make the technology unusable in the future. First, as the capacity per drive continues to increase, the time required to rebuild a drive keeps growing; it is now measured in days. This matters because, to avoid data loss, the array has to complete a rebuild before another drive fails. The rebuild process also usually takes a significant performance toll on the applications using that storage. This ties into the second problem, which is that the likelihood of a multiple drive failure increases as capacity increases. Drives carry a statistic called the Bit Error Rate. As the capacity of drives increases, the chance of errors on those drives increases as well. The combination of longer rebuild times and an increasing likelihood of failures creates a perfect opportunity for data loss and has led many to pronounce RAID dead. The suppliers, though, are saying "not so fast."
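A rough calculation shows why the Bit Error Rate matters more as drives grow. The sketch below assumes an error rate of one unreadable bit per 10^14 bits read, a figure often quoted on drive spec sheets but used here only as an illustration, and it treats errors as independent. The six-drive group and the drive sizes are likewise assumptions, not measurements.

```python
import math

def read_error_probability(tb_read, bits_per_error=1e14):
    """Chance of at least one unreadable bit while reading tb_read terabytes.

    Assumes independent errors at a rate of one per bits_per_error bits.
    """
    bits_read = tb_read * 1e12 * 8  # decimal terabytes to bits
    # 1 - (1 - p)^n, written with log1p/expm1 so a tiny p does not round away
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_error))

# Hypothetical six-drive group: a rebuild must read the five survivors in full.
for drive_tb in (1, 2, 4):
    tb_to_read = drive_tb * 5
    chance = read_error_probability(tb_to_read)
    print(f"{drive_tb} TB drives: read {tb_to_read} TB, "
          f"{chance:.0%} chance of hitting a read error")
```

Under these assumptions the chance of hitting an error during a rebuild climbs quickly with drive size, even though the per-bit error rate never changes.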

Faster Storage Controllers

When a drive fails in a RAID group, a race starts to rebuild the data before another drive (or two, with RAID 6) fails. A series of mathematical calculations is made to determine what was on the failed drive, and then that data is written to the replacement drive. The faster the calculations and the writes are made, the more overall system performance is impacted. With most RAID systems, you can throttle the rebuild rate down so that application performance is not impacted, but you do so at the risk of a longer window of exposure. A simple way around this is for manufacturers to supply systems with controllers that have excess performance capability, so they can rebuild at full speed with little impact on application performance. While this is an improvement, you are still going to have to deal with some period of time during which you are exposed.
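The throttling trade-off can be put into numbers with a back-of-the-envelope model. Everything in this sketch is an assumption chosen for illustration: a 4 TB drive, a controller that can rebuild at 100 MB/s when unthrottled, seven surviving drives, and a 3% annual failure rate per drive applied as a constant rate.

```python
import math

HOURS_PER_YEAR = 8760

def exposure_window_hours(capacity_tb, full_rate_mb_s, throttle):
    """Hours the array runs degraded when the rebuild gets `throttle` of full speed."""
    if not 0 < throttle <= 1:
        raise ValueError("throttle must be greater than 0 and at most 1")
    capacity_mb = capacity_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return capacity_mb / (full_rate_mb_s * throttle) / 3600

def second_failure_probability(window_hours, surviving_drives, annual_failure_rate):
    """Chance that at least one surviving drive fails inside the window."""
    hourly_rate = -math.log(1 - annual_failure_rate) / HOURS_PER_YEAR
    return 1 - math.exp(-hourly_rate * surviving_drives * window_hours)

for throttle in (1.0, 0.5, 0.1):
    window = exposure_window_hours(4, 100, throttle)
    risk = second_failure_probability(window, surviving_drives=7,
                                      annual_failure_rate=0.03)
    print(f"throttle {throttle:.0%}: {window:6.1f} hours exposed, "
          f"{risk:.3%} chance of a second drive failure")
```

In this model, cutting the rebuild to a tenth of full speed stretches the window from hours to days, and the risk of a second failure grows roughly in step. A controller with headroom to spare lets you stay near the top row without starving applications.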

More Storage Intelligence

Another option is to be more intelligent about the failure itself. For example, most arrays will start a rebuild after a drive has reached a certain threshold of errors. In this process, the whole drive is marked bad, basically taken offline, and then the rebuild starts. In a way, data is unnecessarily put at risk while the rebuild happens. Instead of failing the drive prior to the rebuild, some systems have the intelligence to mark only certain sections, or even platters, of the drive as bad. Additionally, some systems can keep the drive online and copy the good data to a new drive before failing the old one. The new drive then replaces the old, error-prone drive in the RAID group. With this capability, data is safer and the copy is faster because no mathematical calculations need to be made.
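The copy-first approach can be sketched as follows. This is a hypothetical illustration rather than any vendor's algorithm: it assumes XOR parity, models each drive as a list of blocks, and uses `None` to mark a block that the failing drive can no longer read. The function name `replace_failing_drive` is invented for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def replace_failing_drive(failing, peers):
    """Build a replacement drive, copying what is readable and computing the rest.

    failing: blocks from the suspect drive; None marks an unreadable block.
    peers:   the other drives in the group (data and parity), one list each.
    Returns the new drive's blocks and how many needed parity calculations.
    """
    new_drive, reconstructed = [], 0
    for stripe, block in enumerate(failing):
        if block is not None:
            new_drive.append(block)  # straight copy, no calculation needed
        else:
            new_drive.append(xor_blocks([peer[stripe] for peer in peers]))
            reconstructed += 1
    return new_drive, reconstructed

# Hypothetical group: two data drives plus XOR parity, four stripes each.
drive_a = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
drive_b = [b"1111", b"2222", b"3333", b"4444"]
parity = [xor_blocks([a, b]) for a, b in zip(drive_a, drive_b)]

# drive_a is error prone: its third block can no longer be read.
failing_a = [b"aaaa", b"bbbb", None, b"dddd"]
new_a, math_needed = replace_failing_drive(failing_a, [drive_b, parity])

print(new_a == drive_a, math_needed)
```

Only the one unreadable block requires reading the peer drives and doing parity math. The rest is a drive-to-drive copy, and the group keeps its redundancy for every readable block while the copy runs.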

Flash Storage

A final option may be the use of solid-state storage instead of mechanical hard drives. RAID rebuilds are read- and write-heavy operations that, in the mechanical world, involve high-capacity drives. Solid-state storage is well suited to heavy read and write operations, and the capacity per drive or module is typically smaller. We have seen reported numbers of less than 30 minutes to rebuild a failed Flash module in a 5TB Flash storage volume. There has been some concern that RAID on Flash increases the chance of wearing out the memory cells, but several vendors have now designed Flash-specific RAID algorithms that protect against component failure and balance the write workload across the memory modules.

RAID may have its challenges in the future data center but it is not alone. As the state of the art in storage emphasizes greater performance and higher capacity in the same physical space, error rates will continue to increase. It is up to the storage suppliers to address these issues through additional controller and system intelligence so that the user is protected while being able to benefit from the advances in the technologies. The user's job is to understand what options are available and how each vendor is implementing them.


George Crump is lead analyst of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. Find Storage Switzerland's disclosure statement here.



