While You Were Sleeping

Eighty percent of system failures are caused by failed changes that disrupt systems, or by unauthorized or unplanned changes. Having an enforceable change control policy in place can curb such failures -- so you can sleep at night.

InformationWeek Staff, Contributor

September 4, 2007


Long ago and out of necessity, CIOs grasped the importance of enterprise production planning and, in particular, scheduling maintenance during those times when transactions were at their lowest levels of activity.

As a best practice, it makes sense for IT to process the thousands of changes, migrations, and occasional patches during off hours so that business can be conducted without fear of disruption. Best of all, change management planning can be regularly scheduled and enforced according to your business cycle. Whether the work is done internally or outsourced, the second or third shift, the end of a quarter, and the period after the high sales season all serve as logical maintenance windows. Whichever approach you choose, the dynamics and velocity of today's marketplace require IT to be both controlled and responsive.

While this is good and necessary, production planning is only part of the solution for protecting the ability to conduct business without interruption. You have to drill down a little further to see the things that will really have you tossing and turning at night. Because while you are sleeping, when all the changes, migrations, and patches are most likely being processed, a single unauthorized or unintended change can quickly, and frighteningly easily, bring down your company's e-mail, voice mail, network, payroll processing system, or ability to process orders or ship products -- and with it, the ability to remain in compliance or do business the next day.

Gartner Research recently reported that 80% of system failures are caused by failed changes that disrupt systems, or by unauthorized or unplanned changes made intentionally or in error by humans. Having a consistent, enforceable change control policy will prevent most of these system failures from occurring. Yet eliminating unauthorized or unplanned change is something that challenges business leaders and CIOs. This may result from a lack of visibility, the absence of a consistent governance process, or a reluctance to establish "too much" control. Coincidentally, this challenge also can result in CIOs having very dark circles under their eyes.

Unfortunately, what these executives may not know is that it is precisely during emergencies and system failures that it is most critical to enforce change control policies and procedures. This is when systems are at their most vulnerable, when unauthorized or quick changes can cause the most harm, and when failed changes -- and their effect on infrastructure -- are most likely to occur.

Wake Up and Smell the Control Process
How do you strike the right balance of control and agility? You can start with planning and a consistent tone at the top. By that, I mean implementing a clear, auditable, and enforceable change control process that has visibility and buy-in from the executive team. Ensure that the process is end-to-end, from authorization to testing to a clear recovery plan for those events that don't go as planned.

In its simplest terms, this control process can be defined as follows: all changes must be identified, authorized, and approved for production, no exceptions. Planned events are scheduled and put through the rigor of a quality assurance process. The same holds true even in emergencies and outages, where changes must be made quickly, but in a controllable and auditable manner. In this way, well-intended changes made in the 'heat of battle' -- at late hours, without senior management on site -- have a greater chance of not wreaking havoc within your business. Planned activity and emergency changes alike are orchestrated with the needed level of control and agility.

There are multiple layers to an effective process. This means that CIOs need a governance process and a structure of checks and balances, one that establishes workflow approvals in concert with the business and with predetermined levels of authorization. For instance, there needs to be a clear segregation of duties between the person performing the work and the person who manages production changes. Monitoring and updating system access, especially as staff are moved or promoted, is another way of ensuring that only those who are authorized to make changes do so, and that applies to the CIO, too! Access and control are about much more than just keeping systems up and running. They are also about maintaining regulatory compliance and security, which, for many companies since the emergence of Sarbanes-Oxley, are key metrics at the board and audit committee levels.
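
For readers who like to see a rule written down, here is a minimal Python sketch of the checks described above: no change reaches production without approval, the approver must be on a predetermined list, and the person doing the work cannot approve it. Every name in it (ChangeRequest, approve, release_to_production, the sample user names) is hypothetical and chosen only for illustration; it is a sketch under those assumptions, not a description of any particular change management product.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ChangeRequest:
    """A hypothetical record for one identified change."""
    change_id: str
    description: str
    implementer: str              # person performing the work
    emergency: bool = False       # emergency changes follow the same rules
    approver: Optional[str] = None
    audit_log: List[str] = field(default_factory=list)


def approve(change: ChangeRequest, approver: str,
            authorized_approvers: Set[str]) -> None:
    """Record an approval, enforcing authorization and segregation of duties."""
    if approver not in authorized_approvers:
        raise PermissionError(f"{approver} is not authorized to approve changes")
    if approver == change.implementer:
        # Segregation of duties: the implementer cannot approve their own work.
        raise PermissionError("implementer cannot approve their own change")
    change.approver = approver
    change.audit_log.append(f"approved by {approver}")


def release_to_production(change: ChangeRequest) -> None:
    """Allow a release only if the change has been approved; no exceptions."""
    if change.approver is None:
        raise PermissionError(f"{change.change_id} has not been approved")
    kind = "emergency" if change.emergency else "planned"
    change.audit_log.append(f"released to production as {kind} change")
```

Note that the emergency flag changes only how the release is labeled in the audit trail; it does not bypass approval, which mirrors the point that emergency changes must still be controllable and auditable.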

Choose to Snooze, Soundly
You're probably wondering how IT can possibly respond to a normal, last-minute change to a promotion from marketing if every change has to be so darned planned and methodical. I know full well that IT has to be responsive and time sensitive to be of value to the business. That's the beauty of having control policies and procedures in place. You won't be constantly firefighting or struggling to keep the systems up and running. You will be plagued far less by unplanned work. You won't have the reputation of not being able to get things done for the business.

In fact, you'll have quite the opposite experience. Your staff will have greater capacity to focus on strategic work and the confidence to implement last-minute requests, whether it's a software upgrade to ward off a virus or a big, last-minute promotional event for Super Bowl Sunday. Your organization will have achieved that essential balance of control and agility, and eliminated the exhaustion suffered by IT teams trying to recover from a major outage. By the way, I am sure that the sales manager won't miss having to make the call to a customer saying, "We can't process your order."

To those CIOs who don't have established control policies, I have a suggestion. Spend those restless midnight hours thinking about how your staff handles a change gone wrong. Do they dial your number? In that case, be prepared, because the IT bugs will surely bite at night.

Sweet Dreams
To those CIOs with established cultures of change (changes are authorized, approved, auditable -- no exceptions), you know that your staff is prepared to follow the procedures, policies, and checks and balances that are firmly in place to prevent a rogue change from taking down your systems. You know that enforceable change control enables your team to be agile enough to handle an emergency when necessary. To all of those CIOs, I say, "Sleep tight."

Tom Lesica is the former group VP of business operations and technology for Avaya. Before joining Avaya, Tom was chief operating officer and CIO of NewRoads. During Tom's tenure he also served as CIO for the Pepsi-Cola Co. as well as senior VP and CIO for J.Crew. Since 1981 when he joined IBM, Tom has held a diverse set of executive and leadership roles in the public and private sectors, across several industry verticals. He serves on the Tripwire Advisory Board, in addition to several other boards and consulting engagements.
