September 6, 1999
Auction Site's Bid For High Availability
By Gregory Dalton
"We grew so fast and the volume grew so fast that we got ahead of ourselves," says Maynard Webb, an IT veteran who has left his position as CIO of Gateway Inc. to become president of eBay Technologies, which has responsibility for eBay's IT department and product-design and engineering units. "There are parts of the architecture that worry us. We haven't implemented the redundancy we need to be able to hide some of the things that happen."
The company has pursued some strategies for avoiding site meltdowns. It has, for example, steered clear of gee-whiz technology. "We don't use any bleeding-edge technology--no Java, no Corba," says Mike Wilson, eBay's chief scientist and one of its earliest employees. "We use a Mack truck." The problem, as eBay has acknowledged, is that it takes skill to keep an overladen truck from crashing.
The problem on Aug. 6 was a switch that went haywire at 4:45 a.m. Pacific time and took down eBay's network connection even though, Wilson says, the company has eight pipes out to the Internet. "We're designed for no single point of failure, but this particular device found a weakness," he says, declining to identify the specific technology or vendor involved. The Aug. 9 problem was different, but the company isn't talking about that, either. "Our outages never tend to be the same thing, which is incredibly disconcerting," Wilson says.
Since its first well-publicized outage in June, eBay has been hiring senior executives to bring some badly needed leadership to its IT organization. Bob Quinn, a former CIO at Sun Microsystems, was hired as CIO, and former IBM executive Mark Ryan was named chief technology officer. Both men will soon report to Webb.
To beef up its systems, eBay has added more servers and increased IT spending by an undisclosed amount. Its biggest single move since the crashes began was implementing a second database as a so-called warm backup. Webb says a top priority is to install a hot backup that can be used on a moment's notice.
Those moves will help, but eBay executives admit that improved management is the key to reducing future outages and better handling those that do occur. "Where we have stumbled is in operational excellence," Webb says. Wilson adds that the company's operations will be improved by focusing on simplicity in its system design and introducing more people who manage the process and discipline of the IT organization rather than create cool new features and functions.
Industry analysts say that's desperately needed because eBay's architecture has been cobbled together by young Web jocks who created brilliant front-end applications but lacked the skills needed to design heavy-duty information systems. The situation is so dire, some analysts say, that eBay may ultimately have to create an entirely new information architecture in parallel and then switch over to it at some point.
An eBay spokesman confirms the company is beginning to evaluate such a move. Meanwhile, a small army of consultants from Andersen Consulting, IBM, Sun, and other companies is camped out at eBay's San Jose, Calif., headquarters to try to help the IT department find its way out of the woods.
As Webb prepares to improve IT systems and management procedures, he consoles himself with the knowledge that eBay's troubles are "high-class problems," meaning they stem from the auction site's vast popularity. His advice for anyone seeking to learn from eBay's experiences: "You have to plan for a lot more success than you can dream of."
Illustration by John Bleck

Few companies have learned more about the need for strict crisis-management procedures than eBay Inc. The online auction house has been plagued by outages this summer, including one on Aug. 6 and another on Aug. 9. The company failed to build redundancy into its systems--and it's paying the price.