Commentary
10/28/2013 09:56 AM
Jim Ditmore

Why Do Big IT Projects Fail So Often?

Obamacare's website problems can teach us a lot about large-scale project management and execution.



By now nearly every American has heard about or witnessed the poor performance of healthcare.gov. Early on, only one of every five users was able to actually sign in to the site, while poor performance and unavailable systems continue to plague the federal and some state exchanges. Jeffrey Zients, the Obama appointee called in to fix healthcare.gov, promised on Oct. 25 that the site "will work smoothly for the vast majority of users" by the end of November.

Soon after the launch on Oct. 1, former federal CTO Aneesh Chopra, in an Aspen Institute interview with The New York Times' Thomas Friedman, shrugged off the website problems, saying that "glitches happen." Chopra compared the healthcare.gov downtime to the frequent appearances of Twitter's "fail whale" as heavy traffic overwhelmed that site during the 2010 soccer World Cup.

But given that the size of the signup audience was well known in advance and that website technology is mature and well understood, how could the government create such an IT mess? Especially given how much lead time the government had (more than three years) and how much it spent building the site (an estimated $300 million to $500 million).

This project failure isn't quite so unusual, unfortunately. Industry research suggests that large IT projects are at far greater risk of failure than smaller efforts. A 2012 McKinsey study revealed that 17% of IT projects budgeted at $15 million or higher go so badly as to threaten the company's existence, and more than 40% of them fail. As bad as the U.S. healthcare website debut is, there are dozens of examples, in both government and the private sector, of similar debacles.

In a landmark 1995 study, the Standish Group established that only about 17% of IT projects could be considered "fully successful," another 52% were "challenged" (they didn't meet budget, quality or time goals) and 30% were "impaired or failed." In a recent update of that study conducted for ComputerWorld, Standish examined 3,555 IT projects between 2003 and 2012 that had labor costs of at least $10 million and found that only 6.4% of them were successful.

Combining the inherent problems associated with very large IT projects with outdated government practices greatly increases the risk factors. Enterprises of all types can track large IT project failures to several key reasons:


-- Poor or ambiguous sponsorship

-- Confusing or changing requirements

-- Inadequate skills or resources

-- Poor design or inappropriate use of new technology

Strong sponsorship and solid requirements are especially difficult to come by in a political environment (read: Obamacare), where too many individual and group stakeholders have reason to argue with one another and change the project. Applying the political process of lengthy debates, consensus-building and multiple agendas to defining project requirements is a recipe for disaster.

Furthermore, based on my experience, I suspect the contractors doing the government work encouraged changes, as they saw an opportunity to grow the scope of the project with much higher-margin work (change orders are always much more profitable than the original bid). Inadequate sponsorship and weak requirements were undoubtedly combined with a waterfall development methodology and overall big bang approach usually specified by government procurement methods. In fact, early testimony by the contractors indicated a lack of testing on the completed system and last-minute changes.



Why didn't the project use an iterative delivery approach to hone requirements and interfaces early? Why not start with site pilots and betas months or even years before the Oct. 1 launch date? The project was underway for more than three years, yet nothing was made available until Oct. 1. And why didn't the government effort leverage public cloud capabilities (or approaches) to enable efficient scaling? And where was the horizontal scaling design within the application to enable easy addition of capacity for unexpected demand?
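To make the horizontal-scaling point concrete, below is a minimal sketch of a stateless web tier. It assumes a Python/Flask service with a Redis-backed shared store for in-flight applications; none of this reflects healthcare.gov's actual stack, which hasn't been disclosed in detail. Because no state lives in any single web process, capacity can be added under unexpected demand simply by starting more identical instances behind the load balancer.

```python
# Minimal sketch of a horizontally scalable, stateless web tier.
# Assumptions (not from the article): Python with Flask, Redis as the
# shared state store, and a hypothetical host name for that store.

import uuid
from flask import Flask, jsonify
import redis

app = Flask(__name__)

# All shared state goes to an external store, never to process memory,
# so any instance can serve any request.
store = redis.Redis(host="session-store.internal", port=6379, decode_responses=True)

@app.route("/applications", methods=["POST"])
def start_application():
    """Create an enrollment application; any instance can serve the follow-up."""
    app_id = str(uuid.uuid4())
    store.hset(f"application:{app_id}", mapping={"status": "started"})
    store.expire(f"application:{app_id}", 86400)  # keep partial applications for one day
    return jsonify({"application_id": app_id}), 201

@app.route("/applications/<app_id>", methods=["GET"])
def get_application(app_id):
    """Read the application back from the shared store, not local memory."""
    data = store.hgetall(f"application:{app_id}")
    if not data:
        return jsonify({"error": "not found"}), 404
    return jsonify(data)

if __name__ == "__main__":
    app.run(port=8080)
```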

These techniques appear to have been missed entirely in the website implementation. Furthermore, the website code appears to be sloppy, not even using common caching techniques to improve performance. So in addition to suffering from weak sponsorship and ambiguous requirements, this program failed to leverage well-known technology and design best practices.
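For illustration, here is a minimal sketch of the kind of common caching the site reportedly lacked: server-side caching of a slow back-end lookup, plus an HTTP Cache-Control header so browsers and CDNs can reuse responses. It again assumes a Python/Flask stack, and fetch_plan_catalog() is a hypothetical stand-in for a call to a back-end system of record.

```python
# Minimal sketch of two common caching techniques: in-process caching of a
# slow lookup, and HTTP cache headers for reference data.
# Assumptions (not from the article): Python/Flask; fetch_plan_catalog() is
# a hypothetical placeholder for the real back-end call.

import time
from functools import lru_cache
from flask import Flask, jsonify

app = Flask(__name__)

def fetch_plan_catalog():
    # Stand-in for a slow call to a back-end system of record.
    time.sleep(2)
    return [{"plan": "example-bronze", "premium": 250}]

@lru_cache(maxsize=1)
def cached_plan_catalog(cache_bucket: int):
    """Cache the expensive lookup; a new bucket value refreshes it."""
    return fetch_plan_catalog()

@app.route("/plans")
def plans():
    bucket = int(time.time() // 300)          # rotate the cache every 5 minutes
    payload = cached_plan_catalog(bucket)
    resp = jsonify({"plans": payload})
    # Let browsers and CDNs reuse the response instead of hitting the servers.
    resp.headers["Cache-Control"] = "public, max-age=300"
    return resp

if __name__ == "__main__":
    app.run(port=8080)
```

Even a five-minute cache on reference data such as plan catalogs can eliminate a large share of repeated back-end calls during peak signup traffic.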

One would have thought that given the scale and expenditure on this program, the government would have assigned top technical resources and applied those best practices. Now the feds are scrambling with a "surge" of tech resources for the site.

While I wish the new project leaders and implementers well, this surge will bring its own problems. Ideas introduced now may not be accepted or integrated easily. And if the project couldn't handle the "easy" technical work -- sound website design and horizontal scalability -- how will it handle the more difficult challenges of data quality and security?

What To Do?

Clear sponsorship and proper governance are table stakes for any big IT project, but in this case more radical changes are in order. Why have all 36 states and the federal government roll out their healthcare exchanges in one waterfall, big bang approach? The exchanges that are working reasonably well (such as the District of Columbia's) were developed independently. Divide the work up where possible, and move to an iterative or spiral methodology. Deliver early and often.

Perhaps even introduce competitive tension by having two contractors compete against each other for each such cycle. Pick the one that worked the best and then start over on the next cycle. But make them sprints, not marathons. Three- or six-month cycles should do it. The team that meets the requirements, on time, will have an opportunity to bid on the next cycle. Any contractor that doesn't clear the bar gets barred from the next round so that there's no payoff for a contractor encouraging endless changes. And you have broken up the work into more doable components that can then be improved in the next implementation.

Finally, use only proven technologies. And why not ask the CIOs or chief technology architects of a few large-scale Web companies to spend a few days reviewing the program and designs at appropriate points? It's the kind of industry-government partnership we would all like to see.

If you want to learn more about how to manage (and how not to manage) large IT programs, I recommend "Software Runaways," by Robert L. Glass, which documents some spectacular failures. Reading the book is like watching a traffic accident unfold: It's awful but you can't tear yourself away. I expand on the root causes of and remedies for IT project failures in my blog post on project management best practices. And how about some government IT projects that went well? Here's one site's top 10 for 2012.

What project management best practices would you add? Please weigh in with a comment below.
