Commentary

George Crump
 

Data Deduplication Will Not Become A Feature

As data deduplication matured last year, the constant question I was asked by industry analysts was "Isn't this just a feature?" The question implied that any company focused specifically on data deduplication was going to be erased by the larger manufacturers as they added deduplication to their offerings. It seemed logical, but it hasn't occurred. The major manufacturers have struggled to put together viable strategies for data reduction and, to some extent, it's really not in their best interests to reduce the amount of storage required.

The biggest challenge? For data deduplication to work well, it needs to be tightly integrated into the existing operating system of the storage system itself. If you have a storage array OS whose source code is three, four, or more years old, then integrating a dramatically new way of placing data on disk is going to become quite complex. The workaround is what is commonly called post-process deduplication, which walks the disk at certain intervals to determine whether there are redundant areas.
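To make the post-process idea concrete, here is a minimal sketch in Python. It is not any vendor's implementation. It assumes data has already landed on disk as fixed-size blocks, and the names `post_process_dedupe`, `landing_area`, `store`, and `recipe` are hypothetical, chosen only for illustration. A later pass hashes each block and keeps one copy per unique hash.

```python
import hashlib


def post_process_dedupe(landing_area):
    """Walk blocks that were already written and collapse duplicates.

    landing_area: list of bytes objects, one per block, in written order
                  (an assumed stand-in for the un-examined storage area).
    Returns (store, recipe):
      store  maps a SHA-256 digest to the single retained copy of a block
      recipe lists the digests in original order, so data can be rebuilt
    """
    store = {}
    recipe = []
    for block in landing_area:
        digest = hashlib.sha256(block).hexdigest()
        # Keep the block only the first time this digest is seen.
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original block sequence from the deduplicated store."""
    return [store[digest] for digest in recipe]
```

Note that the blocks sit in `landing_area` at full size until the pass runs, which is the second storage area the post-process approach has to manage.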

This method has two drawbacks. First, it creates two storage areas to manage: one holding data that is waiting to be examined for duplicates and one holding data that has already been examined. Second, it delays the creation of a DR copy of the data. A common use for deduplicated systems is to leverage their ability to store only unique data segments and replicate only those new segments to the remote location. With the post-process method, you have to wait until the deduplication step is complete before data can be replicated. The post-process step can be very time consuming and can delay the update of the DR site by 6-10 hours.
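The replication behavior described above can be sketched roughly as follows. This is an illustrative model only, assuming segments are identified by a SHA-256 digest and that the remote site's contents can be represented as a dictionary; `replicate_new_segments` and `remote_index` are hypothetical names, not part of any product.

```python
import hashlib


def replicate_new_segments(segments, remote_index):
    """Send only the segments the remote (DR) site does not already hold.

    segments:     iterable of bytes objects produced by deduplication
    remote_index: dict mapping digest -> segment, an assumed model of
                  what the DR site currently stores (updated in place)
    Returns the number of bytes that had to be sent.
    """
    bytes_sent = 0
    for segment in segments:
        digest = hashlib.sha256(segment).hexdigest()
        if digest not in remote_index:
            # New, unique segment: it must cross the wire.
            remote_index[digest] = segment
            bytes_sent += len(segment)
    return bytes_sent
```

The sketch also shows why post-processing delays DR: the digests that decide what to send exist only after the deduplication pass has examined the data.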



As a result, companies that started with data deduplication as a core part of their technology (Data Domain, Permabit, Diligent) have a distinct advantage. The other companies will have to make post-process data deduplication much more seamless than it is today, exit from deduplication altogether, or rewrite their code bases to support in-line data deduplication.

George Crump is founder of Storage Switzerland, an analyst firm focused on the virtualization and storage marketplaces. It provides strategic consulting and analysis to storage users, suppliers, and integrators. An industry veteran of more than 25 years, Crump has held engineering and sales positions at various IT industry manufacturers and integrators. Prior to Storage Switzerland, he was CTO at one of the nation's largest integrators.

