Commentary

Howard Marks
 

Data Domain's DD120 Brings De-Duping Down To Branch Offices

In the four short years since Data Domain introduced its original DD240 appliance, hardware data de-duplication in the data center has evolved from interesting technology to accepted, if not yet standard, practice. While big enterprise data centers with petabytes of data and hundreds of terabytes of nightly backups are still more interested in raw speed than storage efficiency, most of us could improve our backup infrastructure significantly with de-duplication. With the new DD120, Data Domain brings the cost of data de-duplication down to the point where it makes sense for branch offices.

Using a replicating backup appliance at remote offices lets you use your existing backup software to view and manage backups across the organization and automatically replicate the backups to the data center. These appliances minimize bandwidth usage by de-duplicating the data before replicating it and, unlike some remote backup solutions, they also provide a local copy of the data for fast restores.

The 1U DD120 uses three 250-GB drives to deliver about 373 GB of space for backups before de-duplication. While de-duplication ratios vary by data type, backup schedule, and phase of the moon, users running a weekly full, daily incremental schedule should see the DD120 hold more than 5 TB of backup data. Since Data Domain's software de-dupes in real time, no disk is "wasted" holding backup data while the appliance de-dupes. Data Domain claims the DD120 can ingest data at 150 GB an hour (using CIFS, NFS, or NetBackup's OST protocol), which means you should be able to run 1 TB or more of backups in an eight-hour window, even with the usual fudge factor to allow for vendor hyperbole.
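For anyone who wants to check the math, here is a quick back-of-the-envelope sketch in Python. It uses only the figures quoted above; the function names are mine, and treating 1 TB as 1,000 GB is an assumption made to keep the numbers round.

```python
# Back-of-the-envelope check of the DD120 figures quoted above.
RAW_CAPACITY_GB = 373           # space for backups before de-duplication
LOGICAL_CAPACITY_GB = 5000      # "more than 5 TB" of backup data (1 TB = 1,000 GB assumed)
CLAIMED_INGEST_GB_PER_HR = 150  # vendor's claimed ingest rate
BACKUP_WINDOW_HR = 8
TARGET_BACKUP_GB = 1000         # the "1 TB or more" nightly backup


def implied_dedupe_ratio(logical_gb: float, raw_gb: float) -> float:
    """De-duplication ratio needed to fit logical_gb of backups into raw_gb of disk."""
    return logical_gb / raw_gb


def window_capacity_gb(rate_gb_per_hr: float, hours: float) -> float:
    """Data that can be ingested during a backup window at a given rate."""
    return rate_gb_per_hr * hours


def required_rate_gb_per_hr(target_gb: float, hours: float) -> float:
    """Sustained ingest rate needed to finish target_gb within the window."""
    return target_gb / hours


if __name__ == "__main__":
    ratio = implied_dedupe_ratio(LOGICAL_CAPACITY_GB, RAW_CAPACITY_GB)
    best_case = window_capacity_gb(CLAIMED_INGEST_GB_PER_HR, BACKUP_WINDOW_HR)
    needed = required_rate_gb_per_hr(TARGET_BACKUP_GB, BACKUP_WINDOW_HR)
    share_of_claim = needed / CLAIMED_INGEST_GB_PER_HR

    print(f"Implied de-dupe ratio: about {ratio:.1f} to 1")
    print(f"Eight-hour window at the claimed rate: {best_case:.0f} GB")
    print(f"Rate needed for 1 TB in the window: {needed:.0f} GB an hour "
          f"({share_of_claim:.0%} of the claim)")
```

In other words, the claimed rate leaves some headroom: the appliance only has to deliver a bit more than four-fifths of the advertised speed to move 1 TB in the window.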



In addition to de-duping data locally, Data Domain also avoids sending data across the WAN that duplicates data already backed up at another remote office. When a DD120 is replicating to one of its Data Domain big brothers at headquarters, it first sends the hashes for its new data. The box at headquarters then sends back a list of blocks it hasn't seen so they can be sent over the line. So the 50 GB sales literature folder that's on every remote office's file server only gets sent across the line once. You also can schedule and throttle replication traffic.
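To make that exchange concrete, here is a minimal sketch in Python. It is not Data Domain's code or wire protocol: the class and function names are hypothetical, SHA-256 is my assumption for the hash, and the byte count ignores the small overhead of the hashes themselves.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Content hash identifying a block (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


class HeadquartersStore:
    """Hypothetical stand-in for the appliance at headquarters."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def missing(self, hashes: list[str]) -> list[str]:
        """Return the hashes this store has not seen, in order, without repeats."""
        already_listed: set[str] = set()
        unseen: list[str] = []
        for h in hashes:
            if h not in self.blocks and h not in already_listed:
                already_listed.add(h)
                unseen.append(h)
        return unseen

    def receive(self, blocks: list[bytes]) -> None:
        """Store incoming blocks under their content hash."""
        for block in blocks:
            self.blocks[fingerprint(block)] = block


def replicate(new_blocks: list[bytes], hq: HeadquartersStore) -> int:
    """Replicate a branch office's new blocks; return block bytes sent over the WAN."""
    by_hash = {fingerprint(b): b for b in new_blocks}
    wanted = hq.missing(list(by_hash))       # send hashes, get back the unseen ones
    to_send = [by_hash[h] for h in wanted]   # send only the blocks headquarters lacks
    hq.receive(to_send)
    return sum(len(b) for b in to_send)


if __name__ == "__main__":
    hq = HeadquartersStore()
    sales_literature = [b"brochure" * 512, b"price list" * 512]
    office_a = sales_literature + [b"office A payroll" * 256]
    office_b = sales_literature + [b"office B payroll" * 256]
    print("Office A sent", replicate(office_a, hq), "bytes")
    print("Office B sent", replicate(office_b, hq), "bytes")
```

The second office sends only its own payroll block, because headquarters already holds the shared sales literature from the first office.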

I must say I was disappointed that Data Domain decided to use 250-GB drives in the DD120. New nearline 500-GB drives are just $70 each more than their 250-GB equivalents at NewEgg. Assuming the usual 3-4 to 1 manufacturer's markup, doubling the DD120's capacity would mean an additional $1,000 or so on the MSRP. I, for one, would rather pay $13,000 for an 800-GB appliance than $12,000 for a 373-GB one. After all, some remote offices, such as one where insurance adjusters take digital photos, could have a lot of data even if there isn't a high rate of change.
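The pricing argument is simple arithmetic, sketched below in Python using only the prices and capacities quoted above. The function names are mine, and the three-drive count comes from the DD120's configuration.

```python
# Rough pricing arithmetic for a hypothetical DD120 built with 500-GB drives.
DRIVES = 3
DRIVE_PREMIUM_USD = 70   # extra retail cost of a 500-GB drive over a 250-GB one
MARKUPS = (3, 4)         # the "usual 3-4 to 1" manufacturer's markup


def msrp_increase(drives: int, premium_usd: float, markup: float) -> float:
    """Added list price if the drive premium is marked up by the given factor."""
    return drives * premium_usd * markup


def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """List price per GB of pre-de-duplication capacity."""
    return price_usd / capacity_gb


if __name__ == "__main__":
    low, high = (msrp_increase(DRIVES, DRIVE_PREMIUM_USD, m) for m in MARKUPS)
    print(f"Marked-up drive premium: ${low:.0f} to ${high:.0f}")
    print(f"373 GB at $12,000: ${cost_per_gb(12000, 373):.2f} per GB")
    print(f"800 GB at $13,000: ${cost_per_gb(13000, 800):.2f} per GB")
```

The marked-up premium comes in under the $1,000 I allowed for, and the larger appliance would cost roughly half as much per GB of raw capacity.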

Now that Data Domain has tuned the new version of its software for de-duping small files, that insurance adjuster could save those photos to the Data Domain appliance directly, storing, de-duping, and replicating the data to headquarters all in one fell swoop.

Replicating de-duped backups is one good way to protect data at remote sites and the DD120 makes it affordable for many.

