When Facebook's Down, Thousands Slow Down - InformationWeek



When Facebook's Down, Thousands Slow Down

When Facebook went down this week, thousands of websites linked to the social media site also slowed down, according to Dynatrace.


An outage that took Facebook and Instagram off the air for an hour Monday affected 29 locations where Facebook operates servers. Curiously, its massive Prineville, Ore., data center complex appears to have remained in operation throughout the outage.  

That means a problem arose in the content-distribution data centers that Facebook has scattered around the US and the globe, in both its own and colocation facilities. A map produced by Dynatrace, an Internet metrics firm, indicates 29 such locations had their operations interrupted for an hour, starting about 9:10 p.m. Pacific time.

As a result, at least 7,500 websites that depend on a JavaScript response from a Facebook server had their operations slowed or stalled by the lack of a response. Facebook users, meanwhile, could reach the service but couldn't get it to respond or do anything for them during the hour.

That's just one of the conclusions an observer can make after examining data from Dynatrace, which tracks website performance for major retailers, financial services ecommerce systems, and online operations for hundreds of enterprises.

Dynatrace has 100 computers around the globe collecting data from "tens of thousands" of headless users, real-world end-users who allow their computers to periodically fire off stored queries to Nike, Netflix, and thousands of other online destinations. The client machines capture the response time and report it to Dynatrace. That allows it to report on application performance to its customers, which include Wells Fargo, LinkedIn, Cisco, Thomson Reuters, and Intuit.
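
The measurement loop described above can be pictured with a short sketch. This is a minimal illustration in Python, not Dynatrace's implementation: the function names measure_response_time and run_probes are hypothetical, and each "probe" is any callable standing in for a stored query against a site.

```python
import time
from typing import Callable, Dict


def measure_response_time(probe: Callable[[], object]) -> float:
    """Run one stored query and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    probe()  # in a real client this would be an HTTP request to the target site
    return time.perf_counter() - start


def run_probes(probes: Dict[str, Callable[[], object]]) -> Dict[str, float]:
    """Time every probe once and return a {site name: seconds} report."""
    return {name: measure_response_time(probe) for name, probe in probes.items()}
```

A client could call run_probes on a schedule and send the resulting dictionary to a central collector, which is where slowdowns across many sites would become visible.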

Another conclusion is that the outage was not caused by a cyberattack, even though one group seeking credit started issuing tweets claiming responsibility. Instead, Facebook 'fessed up to a configuration change gone awry.

"This was not the result of a third-party attack but instead occurred after we introduced a change that affected our configuration systems," according to Facebook's statement.

From its position astride the Internet, Dynatrace said the slowdown of sites that use the familiar Facebook link "Like this page," or are otherwise dependent on Facebook interactions, illustrates the vulnerability of businesses that rely on third-party links to their websites.


Vincent Geffray, a senior product manager at Dynatrace, said its Outage Analyzer service is a big data application sitting on top of the data routinely captured by its application performance management monitoring. Outage Analyzer spotted a slowdown Monday that was simultaneously occurring at the websites of Dynatrace customers and traced it back to their ties to Facebook. In some cases, a site allows a visitor to log in using his Facebook identity. In others it responds to a "like" recorded on the Dynatrace customer's site.
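
To make the idea of tracing simultaneous slowdowns back to a shared dependency concrete, here is a rough sketch in Python. It is not how Outage Analyzer works internally, which the article does not describe; the function name suspect_third_parties, the min_share threshold, and the example hosts are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set, Tuple


def suspect_third_parties(
    site_dependencies: Dict[str, Set[str]],
    slow_sites: Iterable[str],
    min_share: float = 0.8,
) -> List[Tuple[str, float]]:
    """Rank third-party hosts by the share of currently slow sites that use them.

    site_dependencies maps each monitored site to the third-party hosts it loads.
    Only hosts used by at least min_share of the slow sites are returned.
    """
    slow = [site for site in slow_sites if site in site_dependencies]
    if not slow:
        return []
    # Count, for each third-party host, how many slow sites depend on it.
    counts = Counter(host for site in slow for host in site_dependencies[site])
    ranked = [
        (host, n / len(slow))
        for host, n in counts.items()
        if n / len(slow) >= min_share
    ]
    # Highest share first; ties broken alphabetically for stable output.
    return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical data: two slow sites both load the same social widget host.
deps = {
    "shop.example": {"social.example", "cdn.example"},
    "bank.example": {"social.example"},
    "news.example": {"cdn.example", "analytics.example"},
}
suspects = suspect_third_parties(deps, ["shop.example", "bank.example"])
```

A production system would also compare against sites that are healthy, since a host used by nearly every site would otherwise always look suspicious.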

Dynatrace has 5,800 customers around the globe. Geffray said the Facebook slowdown occurred simultaneously worldwide. That suggests the configuration change Facebook cited as the cause may have been rolled out rapidly at several sites and spread to others, or even applied globally at the same time. The Dynatrace monitoring shows a sharp spike.

"We're working to get things back to normal as quickly as possible," Facebook spokeswoman Charlene Chian told CNN. Facebook visitors were not totally cut off from their favorite social media. "Sorry, something went wrong," they were told as they tried to access the site.

For retail and enterprise sites that use Facebook as a third-party service, however, the incident took on serious consequences. According to Dynatrace, the short delays that started to show up around 9:10 p.m. PT grew into 39-second delays before a "server not available" or other message was returned to users. The retailers and other businesses were available, but their full pages couldn't move to the next user interaction until the Facebook link finished loading its JavaScript.


In some cases, the inability of the end user's computer to finish building a full page meant that his or her interaction with a target site would be very slow or stall completely.

"Let's say Nike is slow because of Facebook. The customer doesn't know that the degradation is due to Facebook. He just says, 'Nike is slow,'" Geffray said.

The problem exists with any social media service or other third party tied into a website's operation. If the full document object model called for by the page can't be built because a third party's JavaScript never arrives, the page may stall or fail to finish loading. Most websites are built with such interdependencies today. Their owners aren't always aware of the ways a third party might be slowing down the site.
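
One common way to limit this kind of exposure is to stop waiting on a third party after a fixed deadline and fall back to a placeholder. The sketch below shows the idea in Python with asyncio; it is an illustration under stated assumptions, not something the article or Dynatrace prescribes. The names load_widget and render_page, the 0.1-second deadline, and the placeholder markup are hypothetical. In a browser, the analogous step is loading the third-party script asynchronously so the rest of the page does not wait on it.

```python
import asyncio
from typing import Awaitable, Callable

FALLBACK = "<!-- social widget unavailable -->"


async def load_widget(
    fetch: Callable[[], Awaitable[str]],
    deadline_s: float,
    fallback: str = FALLBACK,
) -> str:
    """Wait for the third-party content, but only up to deadline_s seconds."""
    try:
        return await asyncio.wait_for(fetch(), timeout=deadline_s)
    except asyncio.TimeoutError:
        # The third party did not answer in time: give up and use the placeholder.
        return fallback


async def render_page(
    core_html: str,
    fetch_widget: Callable[[], Awaitable[str]],
    deadline_s: float = 0.1,
) -> str:
    """Assemble the page; a slow widget costs at most deadline_s, not the whole page."""
    widget = await load_widget(fetch_widget, deadline_s)
    return core_html + widget
```

With a deadline in place, a third-party stall of tens of seconds costs the visitor only the deadline, and the site's own content still renders.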

Whatever the cause, Facebook rectified the issue within the hour, and sites began to recover normal operations. Facebook has had a strong reliability record on the whole. Its last major outage was five years ago and lasted for 2.5 hours.

Other social media provided a springboard to commenting on the situation. Twitter quickly spawned the hashtag, #facebookdown, where tweeters mocked themselves for not knowing what to do without being able to post selfies to Instagram or personal news to Facebook.

"are you kidding me? east bay emergency dispatch says 5 people called 911 during #facebookdown today!" tweeted Kristen Sze (@abc7kritensze). Reports that people were roaming the streets of Berkeley, shoving photos of themselves into strangers' faces, and asking if they "liked" them, were probably exaggerated.


Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive ...

User Rank: Ninja
1/29/2015 | 9:45:03 PM
continue to put facebook name before their own?
Since Facebook rectified the issue within an hour and its last major outage was 5 years ago and lasted for 2.5 hours, this speaks volumes about Facebook's reliability. Not a bad handling of the situation, I must say!

The only question worth exploring is: Will brands like Nike continue to put Facebook's name before theirs? That is, will they promote nike.com or facebook.com/nike?
User Rank: Ninja
1/29/2015 | 2:41:00 PM
America Offline
I am amused by how little computing culture changes.  America Online went offline for over a day back in the late 1990s and caused all manner of business havoc.  I figure these sorts of problems are an excellent reason to avoid dependency on online services in the business realm (they might be nice, but they should never be a necessity).

User Rank: Author
1/29/2015 | 8:24:04 AM
Re: People slowed down too
I have to say, I didn't even notice the outage. The hype from some publications on this is akin to the hype we had about the snow storm in the northeast. It was said to be "historic" until it happened and proved to be just another snow storm.
User Rank: Strategist
1/28/2015 | 4:56:15 PM
Re: People slowed down too
Yeah, I'd be curious to see the level of increased productivity during that same hour.
Charlie Babcock
User Rank: Author
1/28/2015 | 2:42:53 PM
Application performance management just got harder
Enterprise application performance used to be about getting your application right. Now it's about getting your application and dozens of third party applications and services right. It's possible for third party code to be slowing down your site and you can't see it. Everything looks like it's functioning the way it's supposed to in-house.
User Rank: Ninja
1/28/2015 | 11:39:27 AM
People slowed down too
I think a lot of people 'slowed down' as well. There were certainly fewer people saying to me "did you see X on Facebook" for a couple of hours. It was quite refreshing!