Midmarket firm Hit Promotional cut its backup window from around 17 hours to two. The CIO explains how it streamlined disaster preparedness in its virtualized environment.
Ask Hit Promotional Products CIO Eric Shonebarger about his biggest pain point and you'll hear a familiar answer: Data, and lots of it.
The branded promotional gear wholesale distributor has grown rapidly in recent years. At the same time, the business has increasingly asked IT to connect disparate systems--even if those systems were never intended to interact. Compounding the challenge: Shonebarger's 10-person team, most of whom are focused on development, serves 900 employees and counting. All told, Shonebarger's department must manage about 24 TB of data, triple the amount it had three years ago--no small chore for a midmarket firm.
"When you're dealing with both [unstructured and structured data], it's just a massive undertaking," Shonebarger said in an interview. "To make sure the data is up-to-date, to get it where it needs to belong at the right time, to store it, back it up, retrieve it--all of that is getting more complicated."
Virtualization is another part of the story--more than 90% of Hit Promotional's IT infrastructure runs virtually, and that will soon hit 100%. That wasn't always the case. The company first dipped its toes into the virtual pool back in 2007 when it turned an extra server into a virtual host to test the technology. "Before we knew it, we had 15 or 16 [virtual machines] running on it, and no redundancy," Shonebarger said. In 2009, Hit Promotional made a significant investment to take everything but its email system virtual, deploying a VMware environment on Dell servers, backed up to an EqualLogic storage area network (SAN). For all of virtualization's benefits, the transformation shone a light on some flaws in Hit Promotional's data strategy: Like plenty of other small and midsize businesses (SMBs), it tried to retrofit existing technology and processes to a new world order.
"We were treating those virtual machines just like they were physical machines," Shonebarger said. That was particularly true of Hit Promotional's existing backup and recovery methods--and therein lies an early lesson learned: A significant change like moving from physical to virtual infrastructure has wider implications than a few servers or desktops. The company had been using Symantec's Backup Exec software, but Shonebarger encountered a significant problem after the virtualization makeover: The backup proxy required a direct connection to the storage logical unit number (LUN).
"That's really, really dangerous to have, say, a Windows box with full-blown access to do whatever it wants to VMware LUNs," Shonebarger said. In that scenario, based on an actual one that occurred at Hit Promotional, if an administrator were to provision more storage locally, those LUNs would appear in Device Manager as if they were ordinary USB drives. "Hey, there's a 2 TB LUN here, would you like to format it? All it takes is a single admin to say 'yes' and your whole LUN's gone. And VMware's resilient enough that it won't even tell you there's a problem until it runs out of disk space, can't put stuff into cache anymore, and takes a complete dive on you," Shonebarger said.
The breaking point: Shonebarger realized that a total IT disaster requiring a complete restore would likely take days to recover from, a scenario his business couldn't afford. Hit Promotional tried out a few different options and settled on Veeam, which builds its backup and recovery platform specifically for virtual environments. The first key gain: In that total restore scenario, Shonebarger said data interruption to his users would be a matter of seconds rather than days. That's enabled by Veeam's vPower feature, which will run a VM directly from Shonebarger's disk-based backups while the SAN is restored simultaneously. It becomes a see-no-evil, hear-no-evil proposition for end users.
"[As] IT, we're sometimes aware of problems or mistakes that, if we can keep it from the users and they can still have their normal experience, they don't care if it takes me a day to recover from failure," Shonebarger said. "They just want to know: Is my data available, is the system up and running, and can I get my job done?"
Similarly, Shonebarger has cut down a 17-to-18-hour window for a full SQL database backup--which rendered the database effectively unusable--to around 2 hours. The Veeam platform has also enabled significant compression of Hit Promotional's growing data; the company is in the process of upgrading its SAN, but can currently back up its 24 TB to a target that only actually holds 16 TB, with 5 TB still free.
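The figures above can be turned into rough ratios. The following is a back-of-the-envelope sketch in Python, not anything Hit Promotional or Veeam published: it assumes the 16 TB target holds nothing but backups of the 24 TB data set, and it takes the midpoint of the "17-to-18-hour" window. The variable names are illustrative only.

```python
# Figures quoted in the article.
source_tb = 24.0           # data the department manages
target_capacity_tb = 16.0  # size of the backup target
target_free_tb = 5.0       # space still free on that target
new_window_hours = 2.0     # full SQL backup window after the change

# Assumption: midpoint of the quoted "17-to-18-hour" window.
old_window_hours = (17 + 18) / 2

# Derived values (assumes the target stores only these backups).
used_tb = target_capacity_tb - target_free_tb
reduction_ratio = source_tb / used_tb
space_saved_pct = (1 - used_tb / source_tb) * 100
window_speedup = old_window_hours / new_window_hours

print(f"Backup footprint: {used_tb:.0f} TB for {source_tb:.0f} TB of data")
print(f"Data reduction:   about {reduction_ratio:.1f}:1 ({space_saved_pct:.0f}% saved)")
print(f"Backup window:    about {window_speedup:.1f}x shorter")
```

Under those assumptions, the backups occupy 11 TB, a reduction of a little better than 2:1. If the target also keeps multiple restore points, the effective per-copy reduction would be higher than this estimate suggests.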
Shonebarger offers some parting advice for fellow SMB execs: He beats the recovery testing drum, for one. In his view, too many SMBs back up data without ever pondering the total disaster scenario--and what it would take to recover those backups effectively and efficiently. More importantly, he thinks there's a common misconception that replication equals safety.
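To see why replication alone is not a safety net, consider a deliberately simplified model in Python. It is a toy illustration, not a description of how any SAN or Veeam actually works: the `ReplicatedVolume` class, the `take_backup` helper, and the `orders.db` entry are all hypothetical, and the `wipe` method merely stands in for a destructive change such as an accidental LUN format.

```python
import copy


class ReplicatedVolume:
    """Toy model: every change to the primary is mirrored to the replica."""

    def __init__(self, data):
        self.primary = dict(data)
        self.replica = dict(data)

    def write(self, key, value):
        # Good writes are replicated immediately...
        self.primary[key] = value
        self.replica[key] = value

    def wipe(self):
        # ...and so are destructive changes (stand-in for an accidental format).
        self.primary.clear()
        self.replica.clear()


def take_backup(volume):
    """Point-in-time copy that later changes to the volume cannot touch."""
    return copy.deepcopy(volume.primary)


volume = ReplicatedVolume({"orders.db": "good data"})
nightly_backup = take_backup(volume)  # taken before the mistake

volume.wipe()  # the mistake is faithfully mirrored to the replica

replica_has_data = "orders.db" in volume.replica
backup_has_data = "orders.db" in nightly_backup

print("Replica still has the data:", replica_has_data)
print("Backup still has the data: ", backup_has_data)
```

In this sketch the replica loses the data at the same moment the primary does, while the earlier point-in-time copy survives. Real replication products vary in how and when they propagate changes, so treat this only as a picture of the principle.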
"Don't confuse high availability with backup and recovery," Shonebarger said. By that, he means deploying a replicant SAN, for example, and thinking your job is done. While redundancy can be a good thing, it's no use if the original data is bad. The aforementioned LUN-formatting scenario is one example in the bad data category, and there are plenty of others.
"That's not smart," Shonebarger said. "That can come back to bite you in a pretty bad way."