There is no question that e-mail is mission-critical in most organizations. When e-mail goes down, workers become much less efficient. And as workforces become more mobile and dispersed, the reliance on e-mail extends to workers who expect to stay connected to e-mail wherever they are. So yes, as workers become more mobile, wireless e-mail becomes more mission-critical to them, and if wireless e-mail goes down, so does organizational efficiency.
Why, in your estimation, did the BlackBerry service experience such a prolonged outage? Was this an issue with just their network management? Or has RIM failed to manage its rapid growth effectively?
It is not yet clear exactly what went wrong (RIM hasn't said). But it is clear that the NOC was the point of failure: devices on all carriers went out, not just those on a single carrier, which would have pointed to the outgoing connection specific to that carrier. It seems to me that a scalability issue (i.e., not having enough capacity at the NOC for all the users) would have resulted in problems, but probably not the collapse of the entire service. And RIM did get it up and running reasonably quickly (under 12 hours or so). So I would suspect either a failed hardware component or a software glitch in the NOC (such a thing occurred a few years ago). What is a bit troubling is that RIM had no backup NOC to switch to when the fault occurred (in fairness, a backup NOC for fault tolerance on the scale of RIM's installed base would be very expensive to set up and maintain). But a more distributed NOC environment might be a way for RIM to minimize disruptions in the future.
Was this outage made worse by a lack of contingency planning by corporate IT departments?
Without a doubt. Few companies had backup plans to deal with this issue. Although e-mail was down, the BlackBerry phone functions continued to operate normally, so workers could be contacted and could contact the company. As a backup, companies should have had a way to send simple SMS text messages to their users' phones, alerting them that the system was down and telling them to use the phone for critical messages. The same goes for internal e-mail users. That would have at least alerted everyone to the problem. BTW, this would be a good strategy for RIM to undertake as well, as a backup alerting mechanism (though it might be difficult for them to do, as they do not have a complete listing of all their subscribers' phone numbers).
How can CIOs and IT managers better prepare for an outage of their BlackBerry systems?
It is hard, since so much of the function of modern organizations depends on e-mail. But a stated contingency plan that explicitly looks at the options (e.g., SMS text messaging and voice calls) would be a good idea in case of need. That said, wireless e-mail systems are pretty reliable and nothing is 100% fail-proof, so putting too much emphasis on this backup strategy (i.e., spending lots of money or time) may not be a good investment.
Do you think BlackBerry's recent outage will prompt companies to begin looking at other push e-mail solutions?
Maybe. But switching to an alternative solution is expensive ($845 per user, by our research). Further, some other solutions (e.g., Good/Motorola) also go through a NOC. And non-NOC solutions, like the Microsoft solution, which requires that companies be on the latest Exchange version and have Windows Mobile handhelds, require a significant upgrade and/or change in infrastructure. So I think some companies will look, but I'd advise the majority of companies to stay where they are. For the most part, the service is fairly reliable. Even internal e-mail systems (e.g., Exchange, Lotus Notes) go down on occasion, and few companies rip them out and replace them. So companies should go slowly, consider all options, and weigh all the risks and rewards before making a major move.
Do you think we're going to see more BlackBerry outages like this one in the near future?
Hard to say without knowing the exact cause of this failure. I would hope that RIM learns from this and eliminates the cause of the failure. But there are no guarantees that another fault will not spring up. As long as we have computer systems, we will have outages. That is just the way it works. Our phone systems go out, as do our network/Internet connections. Technology is just not perfect.