This growth strained the Merc's backup system. The exchange's infrastructure is made up of four tiers of storage: tier 1 for critical application data and databases that require high performance, such as trading systems; tier 2 for noncritical production databases; tier 3 for long-term archiving, such as that required for regulatory retention; and tier 4, a tape-based system for backup and recovery.
The sore point was tier 4. The Merc had two Storage Technology Corp. tape silos that were controlled by mainframe-based software. Tape backups were slow and resulted in a high rate of backup-and-restore failures. Also, there were only two silos supporting three data centers, so an additional silo would need to be purchased in the near future.
From Tape To Disk
The Merc hit upon the idea of replacing the tier 4 setup, with disks taking over for tapes as the backup-and-restore medium. Tape would become tier 5, used only for off-site storage and disaster recovery.
The exchange evaluated a number of standard ATA-based disk products for tier 4. All suffered from one defect: ATA drives are built to have a "duty cycle"--the percentage of time that a disk is spinning--of 25% to 50%. In the products the Merc investigated, the drives were spinning 100% of the time, meaning they were running at two to four times their rated duty cycle and, as a result, had a higher rate of failure. "What [the storage vendors] are doing is installing drives that have a 25% to 50% duty cycle in hardware where they're spinning 100%," says Craig Taylor, the Merc's associate director of open systems.
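The duty-cycle arithmetic can be sketched in a few lines of Python. The function name is hypothetical, and the calculation assumes that overuse is simply the ratio of actual spin time to rated duty cycle, which the article implies but does not state.

```python
def overuse_factor(rated_duty_cycle: float, actual_duty_cycle: float = 1.0) -> float:
    """Ratio of actual spin time to the drive's rated duty cycle.

    Both arguments are fractions of time spent spinning (0.25 means 25%).
    """
    if not 0 < rated_duty_cycle <= 1:
        raise ValueError("rated_duty_cycle must be in (0, 1]")
    if not 0 <= actual_duty_cycle <= 1:
        raise ValueError("actual_duty_cycle must be in [0, 1]")
    return actual_duty_cycle / rated_duty_cycle


# Rated range quoted in the article: 25% to 50%, with drives spinning 100% of the time.
for rated in (0.25, 0.50):
    factor = overuse_factor(rated)
    print(f"rated {rated:.0%}: running at {factor:.0f}x the rated duty cycle")
```

A drive rated for a 50% duty cycle is pushed to twice its rating when it spins constantly, and one rated for 25% is pushed to four times its rating.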
The Merc turned to Revolution 200T from Copan Systems. It employs a radically different approach: a massive array of idle disks, or Maid. In the setup, a disk only spins when a piece of data that resides on it is requested; the rest of the time, it's idle. Because the disks are idle most of the time, there's less chance they'll fail. "I've implemented an architecture with ATA drives that spins them down when not in use," Taylor says. "The access isn't real time, but it's backup data, so a 15-second delay is no problem."
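The spin-up-on-demand idea can be illustrated with a toy model. This is a minimal sketch, not Copan's implementation: the class names, the idle timeout and the fixed 15-second spin-up cost (borrowed from Taylor's quote) are all assumptions made for illustration, and the model does not cap how many disks may spin at once.

```python
from dataclasses import dataclass

SPIN_UP_SECONDS = 15.0  # delay Taylor cites; assumed here to be a fixed cost
IDLE_TIMEOUT = 300.0    # hypothetical: seconds of inactivity before spin-down


@dataclass
class Disk:
    spinning: bool = False
    last_access: float = 0.0


class MaidArray:
    """Toy model of an array whose disks spin only when their data is requested."""

    def __init__(self, disk_count: int) -> None:
        self.disks = [Disk() for _ in range(disk_count)]

    def tick(self, now: float) -> None:
        """Spin down every disk that has been idle for IDLE_TIMEOUT or longer."""
        for disk in self.disks:
            if disk.spinning and now - disk.last_access >= IDLE_TIMEOUT:
                disk.spinning = False

    def read(self, index: int, now: float) -> float:
        """Access one disk and return the added latency in seconds."""
        self.tick(now)
        disk = self.disks[index]
        latency = 0.0 if disk.spinning else SPIN_UP_SECONDS
        disk.spinning = True
        disk.last_access = now
        return latency

    def spinning_fraction(self, now: float) -> float:
        """Fraction of disks that are spinning at time `now`."""
        self.tick(now)
        return sum(d.spinning for d in self.disks) / len(self.disks)


array = MaidArray(disk_count=8)
print(array.read(0, now=0.0))                # 15.0: disk 0 was idle and must spin up
print(array.read(0, now=60.0))               # 0.0: disk 0 is still spinning
print(array.read(1, now=100.0))              # 15.0: disk 1 was idle
print(array.spinning_fraction(now=100.0))    # 0.25: two of eight disks spinning
print(array.spinning_fraction(now=1000.0))   # 0.0: both disks have timed out
```

Only the first request to an idle disk pays the spin-up delay; later requests within the timeout window are served immediately, which is why the delay is tolerable for backup data.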
With no more than 25% of the drives powered on at any one time, service life is four times what it would be with conventional always-spinning disk systems.
Because Copan Systems' Maid platform is disk-based, it offers advantages over tape in both latency and bandwidth. The Revolution 200T has a bandwidth of 2.75 terabytes per hour. If a disk needs to be powered up, access takes seconds, so all data is effectively online.
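To put the bandwidth figure in perspective, the following sketch estimates how long a transfer would take at the quoted rate. It assumes the 2.75 terabytes per hour is sustained for the whole job and ignores spin-up delays and network limits; the function name and the sample data sizes are hypothetical.

```python
BANDWIDTH_TB_PER_HOUR = 2.75  # Revolution 200T figure quoted in the article


def transfer_hours(data_tb: float,
                   bandwidth_tb_per_hour: float = BANDWIDTH_TB_PER_HOUR) -> float:
    """Hours needed to move data_tb terabytes at a sustained rate."""
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    if bandwidth_tb_per_hour <= 0:
        raise ValueError("bandwidth_tb_per_hour must be positive")
    return data_tb / bandwidth_tb_per_hour


# Hypothetical backup sizes, chosen only to illustrate the scale.
for size_tb in (1, 10, 50):
    print(f"{size_tb:>3} TB: about {transfer_hours(size_tb):.1f} hours")
```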
The Maid system works well with the Merc's backup software, Symantec Corp.'s NetBackup. To NetBackup, the Maid system appears as a "virtual tape" library. "NetBackup has no idea that these aren't tapes," Taylor says.
In the year since implementing Revolution 200T, the Merc has expanded its total Maid storage to 340 terabytes, with ample capacity for growth. The comparisons between the Maid system and the Merc's old tape backup system are stark: Backups and restores are up to three times faster, and no backups have failed, compared with a continuous series of failures on tape.
Taylor would like to extend the Maid concept to other tiers in the Merc's storage infrastructure, such as tier 3, which uses EMC Corp.'s Centera storage technology. "I'm putting long-term data on [Centera] boxes with spinning disks inside," he says, an arrangement that increases the chances the disks will fail. "That's not a good solution for long-term storage."