Unexpected disruptions can happen to anyone -- your own company or the vendors whose services you use. Here are three things to learn from the latest Facebook outage.

John Beattie, Principal, Sungard Availability Services

October 20, 2021


Six hours. That’s roughly how long the Facebook, Instagram and WhatsApp outage lasted on Oct. 4 (and Oct. 5, depending on your time zone), creating widespread confusion and frustration across the globe.

The social media giant released a statement -- later followed by a detailed explanation -- about the root cause. It attributed the outage to configuration changes on the backbone routers that coordinate network traffic between its data centers, which led to further complications and ultimately brought all Facebook services to a halt. Everyone from Facebook’s in-office and remote work teams to the users of its many apps and products was shut out. Four days later, Facebook experienced another outage.

The incident serves as a reminder that unexpected disruptions can happen to anyone -- your own company or the vendors whose services you use. Here are three things to learn from the latest Facebook outage:

1. Don’t put all your resources in one place

The global reliance on Facebook services can’t be overstated.

Facebook has over 2.8 billion users. Businesses use it for large chunks of their marketing and sales initiatives. In developing nations, the stakes are even higher. Citizens around the world depend on Facebook, Facebook Messenger, and WhatsApp to deliver essential government, healthcare and education services. But relying heavily on a single service -- or even just a few -- for all your needs leaves you vulnerable if one of them experiences a disruption.

Consider your concentration risk and the potential resilience limitations of your third-party partners. How vulnerable would you be if outages like this happened to one of your major vendors? In the end, you might want to think twice about putting all your eggs in one basket.
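One rough way to make concentration risk concrete is to list your business functions, note which providers can deliver each one, and see what stops working when a single provider goes down. The Python sketch below does this for a made-up inventory; the function names, the vendor names and the `DEPENDENCIES` structure are all illustrative assumptions, not part of any real tool.

```python
# Hypothetical inventory: business function -> providers that can each deliver it.
# Every name here is invented for illustration.
DEPENDENCIES = {
    "customer support chat": ["VendorA"],
    "marketing campaigns": ["VendorA", "VendorB"],
    "internal messaging": ["VendorA"],
    "payments": ["VendorC"],
}


def functions_lost(dependencies, failed_vendor):
    """Return the functions whose only provider is the failed vendor."""
    return sorted(
        name
        for name, providers in dependencies.items()
        if set(providers) == {failed_vendor}
    )


def concentration_report(dependencies):
    """Map each vendor to the share of functions that stop if it goes down."""
    vendors = {v for providers in dependencies.values() for v in providers}
    total = len(dependencies)
    return {
        vendor: len(functions_lost(dependencies, vendor)) / total
        for vendor in vendors
    }


report = concentration_report(DEPENDENCIES)
for vendor in sorted(report):
    print(f"{vendor}: {report[vendor]:.0%} of functions lost in an outage")
```

In this made-up inventory, an outage at VendorA would take down half of the listed functions, while marketing campaigns would survive because a second provider is available. A real assessment would also need to weigh how critical each function is, which this sketch ignores.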

2. Set up multiple lines of communication

Think about what this must’ve been like for Facebook employees. Everything they need for work runs through Facebook. And, just like that, it’s all gone.

Most organizations are using -- or plan to use -- a hybrid work model moving forward. This may increase productivity and flexibility, but it’s also harder to support and maintain, especially in a crisis. An outage that halts internal communication can be a costly disruption if you don’t have a ready alternative, like a company directory of phone numbers.

If work from home (WFH) is going to be part of your future, make sure everyone has the right resources to execute crisis plans when needed. As part of your overall business resilience and continuity plans, set up multiple lines of communication so you can relay important updates and information during disturbances.

Test this plan regularly so your employees are well-versed, and make sure to fix any speed bumps ahead of time. Communication is paramount to recovering quickly from a disaster.
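For teams that automate crisis notifications, the idea of multiple lines of communication can be sketched as an ordered list of channels that are tried one after another. The Python example below is a minimal illustration under stated assumptions: `primary_chat` and `sms_gateway` are hypothetical stand-ins for real messaging integrations, and the first one deliberately simulates an outage.

```python
SENT_SMS = []  # Records messages "delivered" by the simulated SMS channel.


def primary_chat(message):
    """Hypothetical primary channel; simulates being down during an outage."""
    raise ConnectionError("chat service unreachable")


def sms_gateway(message):
    """Hypothetical backup channel; here it just records the message."""
    SENT_SMS.append(message)


def send_with_fallback(message, channels):
    """Try each (name, send) pair in order; return the first name that works.

    Returns None if every channel fails, so the caller can escalate,
    for example by working through a phone directory by hand.
    """
    for name, send in channels:
        try:
            send(message)
            return name
        except Exception as error:
            print(f"{name} failed: {error}")
    return None


CHANNELS = [("chat", primary_chat), ("sms", sms_gateway)]
used = send_with_fallback("Outage drill: please confirm receipt", CHANNELS)
print(f"Delivered via: {used}")
```

Running a drill like this on a schedule is one way to find out whether the backup channel still works before a real disruption does it for you.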

3. ‘The best ability is availability’

It’s not the first time Facebook and its services have gone down. It’s not even the longest outage. But it’s a pattern that could have severe consequences if it continues.

In 2019, when discussing the company’s frequent bouts of downtime with employees, Mark Zuckerberg said, “… it’s really important that these services are reliable. Even from just a competition standpoint, what we see is that when we have downtimes in WhatsApp or Instagram Direct, there are people who just don’t come back.”

He’s absolutely right.

According to a survey conducted by OnePoll on behalf of Sungard AS, consumers have little patience for disruptions. Fifty-five percent of respondents revealed they switched a service provider or reduced their service levels because of tech problems during the pandemic.

Now’s the time to reevaluate your business resilience plan. If you experienced the same kind of outage as Facebook, what would be the impact on your business? What points of failure in your systems should you be hedging against? Who would you call during a disaster? How long would it take remote employees to arrive onsite if needed? How long would it take you to return to business as usual?

Run simulations for varying types of disruptions to gauge your organization’s preparedness for each scenario, and keep your business continuity and disaster recovery plans up to date with changes in your working conditions and production environment.

Business disruptions can result in loss of customers and revenue, as well as reputational damage. Addressing your operational resilience now can help you achieve your greatest ability: availability.

Turn a Teachable Moment Into Action

The Facebook outage is a gentle reminder of the price of downtime and the consequences of failing to recover in a timely manner. It’s also an opportunity to take steps to prepare for a disruption of your own or among your partners.

Concentration risk, communication and operational resilience aren’t the only areas you can address. But they’re great places to start.

About the Author(s)

John Beattie

Principal, Sungard Availability Services

As a Principal Consultant at Sungard Availability Services (Sungard AS), John Beattie works closely with organizations to implement third-party risk management programs and reduce operational risk by establishing new business continuity and disaster recovery programs -- or transforming existing ones to improve effectiveness. Prior to joining Sungard AS, John was the Global Director of Business Continuity for News Corporation, where he worked closely with Dow Jones, Fox film, Fox television, Harper Collins, NY Post, MySpace and many more familiar brands.
