A technology outage at the golden arches serves as a reminder of the growing complexity of the supply chain.

Carrie Pallardy, Contributing Reporter

March 21, 2024


If you wanted to order a Big Mac on March 15, you might have been out of luck. McDonald’s experienced a global technology outage, which it attributed to “a third-party provider during a configuration change,” according to an update from Brian Rice, its EVP and global CIO.  

McDonald’s did not share the exact nature of that configuration change, but its widespread impact shows how third-party changes can cascade through increasingly complex technical infrastructures.

What lessons can enterprise leaders take from the McDonald’s outage as they consider the growing complexity of their tech stacks?  

Configuration Confusion 

Configuration changes are a routine part of improving a technical infrastructure’s functionality and performance. But with so many moving parts and different parties involved, a seemingly minor issue can have a widespread impact.

“Something as simple as a typo in the address of a database server -- even if corrected immediately -- could trigger a cascade of failures that spread through the network,” Jaushin Lee, PhD, founder, president, and CEO of zero-trust solutions company Zentera Systems, explains in an email interview.  
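To make Lee's example concrete, here is a minimal Python sketch of a pre-deployment check that could catch a mistyped database address before it reaches production. It is an illustration only: the host names, the allowlist, and the function names are hypothetical and do not describe McDonald’s or any vendor’s actual tooling.

```python
import ipaddress
import re

# Hypothetical allowlist of known database hosts (an assumption for illustration).
KNOWN_DB_HOSTS = {"db-primary.internal.example", "10.0.4.12"}

# Conservative hostname pattern: dot-separated labels of letters, digits, hyphens.
_HOSTNAME_RE = re.compile(
    r"^(?=.{1,253}$)"
    r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?"
    r"(\.[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?)*$"
)


def is_well_formed_host(host):
    """Return True if host parses as an IP address or looks like a hostname."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return bool(_HOSTNAME_RE.match(host.lower()))


def validate_db_host_change(new_host, known_hosts=KNOWN_DB_HOSTS):
    """Return a list of problems; an empty list means the change may proceed."""
    problems = []
    if not is_well_formed_host(new_host):
        problems.append(f"{new_host!r} is not a well-formed host or IP address")
    elif new_host.lower() not in known_hosts:
        problems.append(f"{new_host!r} is not a known database host")
    return problems


if __name__ == "__main__":
    # A transposed-letter typo is well formed but unknown, so it is flagged.
    print(validate_db_host_change("db-primray.internal.example"))
```

A check like this only covers one class of error. It does not replace staged rollouts or the ability to roll a change back quickly.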

Who is making what changes and when isn’t always clear. Tracking down an error and fixing it can be akin to working your way through a maze. “For McDonald's, it's very likely that a lot of the information about the actual incident was not available in a timely fashion from the multiple of third parties that are part of the entire service offering,” Jim Routh, chief trust officer at Saviynt, a cloud identity and governance company, tells InformationWeek.  


Whatever the configuration change was, it led to widespread issues at McDonald’s locations. Some restaurants couldn’t take orders or process payments; some attempted to work around the issue by writing orders down and taking cash, BleepingComputer reports.  

On March 16, McDonald’s shared in its update that its restaurants around the world were back in action. But even hours of downtime have an obvious impact on revenue. “But that’s just direct losses -- damage to the brand and other factors could make such an outage even more painful,” says Lee.   

Tech Stack Visibility  

A third party inadvertently causing an outage this widespread is a stark example of third-party risk. Technical supply chains are growing more complex, not less. “There'll be more enterprises that will have similar kinds of outages, unfortunately,” Routh says.

“In the coming days, we will be analyzing the issue and pushing for accountability across our teams and third-party vendors,” McDonald’s pledged in its update. How can enterprise leaders understand the potential risk surrounding configurations and mitigate it? 


Increasing visibility is critical. “The third-party service provider who’s providing application functionality may see part of the logs and the cloud service provider may see another part of logs, but nobody has the full view of what the actual digital consumer is doing in use of the app and how the apps necessarily are configured to work across platforms,” Routh explains.  

Identity access management can also be fragmented. Enterprises, cloud service providers, and other third parties may all have different approaches.  

It is up to enterprise leaders to work with their third parties to gain visibility into who is making what changes that could impact their ability to operate. “This is a great opportunity for McDonald's to push through some fundamental changes in their third-party governance program that addresses the risk from configuration management,” says Routh.  

Cybersecurity Risk 

McDonald’s emphasized that this outage “was not directly caused by a cybersecurity event.” But it is important for enterprise leaders to understand that configuration issues can lead to cybersecurity incidents.  


“Badly configured applications can leave the application vulnerable to any number of CVEs,” Adam Rice, senior security engineer at managed cybersecurity platform Huntress, points out.

He advocates for reviewing third-party vendors using the CIA triad: confidentiality, integrity, and availability. What kind of controls do vendors have in place to mitigate the risk of failing any one side of that triangle?  

“Business needs to define what exists within those controls and then they need to audit third parties against their own satisfaction against those controls,” Huntress's Rice explains. “If anything fails those controls, they really need to review whether the benefits to the business outweigh the risks and then put compensating controls in place for those risks.”
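The audit Rice describes can be pictured as a simple gap check: list the controls the business requires under each part of the CIA triad, then compare them with what a vendor attests to. The Python sketch below is a hypothetical illustration. The control names and the `audit_vendor` function are assumptions made for this example, not a standard or a Huntress product.

```python
# Hypothetical required controls, grouped by CIA triad pillar (illustrative only).
REQUIRED_CONTROLS = {
    "confidentiality": {"encryption_at_rest", "least_privilege_access"},
    "integrity": {"change_approval", "config_audit_log"},
    "availability": {"tested_rollback", "failover"},
}


def audit_vendor(attested_controls, required=REQUIRED_CONTROLS):
    """Return the required controls a vendor is missing, keyed by pillar.

    An empty dict means every required control was attested.
    """
    gaps = {}
    for pillar, controls in required.items():
        missing = controls - set(attested_controls)
        if missing:
            gaps[pillar] = sorted(missing)
    return gaps


if __name__ == "__main__":
    vendor = {
        "encryption_at_rest",
        "least_privilege_access",
        "change_approval",
        "config_audit_log",
        "failover",
    }
    # This vendor has no tested rollback, so the availability pillar shows a gap.
    print(audit_vendor(vendor))
```

Any gap the check reports is where the business would weigh benefit against risk, as Rice suggests, before deciding how to compensate.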

But even the best laid plans can and will fail. What happens in the event of an outage, linked to a cybersecurity incident or otherwise?  

“The communication necessary across multiple third, fourth, fifth parties is rarely exercised as a scenario,” says Routh. “There's a clear opportunity to do that and to harvest the learnings from that practice into a better configuration management, and that will improve resilience despite the growing complexity of multiple supply chains today.” 

Lee highlights the importance of running tabletop exercises to prepare enterprise teams and improve resilience, offering Netflix as an example. “Netflix developed a great tool (Chaos Monkey) that randomly kills production instances,” he shares. “Netflix had great success using it to create small artificial outages, forcing their team to build resilient systems and procedures.”
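The idea behind that kind of exercise can be shown with a toy simulation. The Python sketch below is not Chaos Monkey and does not use its interface. It is a hypothetical, in-memory model in which one running instance is switched off at random each round, and a service counts as available while a minimum number of instances remain up. All names here are assumptions made for illustration.

```python
import random
from typing import Dict, Optional


def kill_random_instance(instances: Dict[str, bool], rng: random.Random) -> Optional[str]:
    """Mark one randomly chosen running instance as down and return its name."""
    running = sorted(name for name, up in instances.items() if up)
    if not running:
        return None
    victim = rng.choice(running)
    instances[victim] = False
    return victim


def service_is_available(instances: Dict[str, bool], min_healthy: int = 1) -> bool:
    """The service is available while at least min_healthy instances are up."""
    return sum(instances.values()) >= min_healthy


def run_experiment(instances: Dict[str, bool], rounds: int,
                   rng: random.Random, min_healthy: int = 1) -> int:
    """Kill one instance per round; return how many rounds the service survived."""
    state = dict(instances)  # work on a copy so the caller's data is untouched
    survived = 0
    for _ in range(rounds):
        kill_random_instance(state, rng)
        if not service_is_available(state, min_healthy):
            break
        survived += 1
    return survived


if __name__ == "__main__":
    fleet = {"a": True, "b": True, "c": True}
    # Three replicas tolerate two failures before the service goes down.
    print(run_experiment(fleet, rounds=5, rng=random.Random(0)))
```

In this model a service with a single instance survives zero rounds. That is the kind of weakness such experiments are meant to surface before a real outage does.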

With more outages like the one McDonald’s experienced likely to happen to other enterprises, that resilience is a vital consideration for leadership teams. “If you're anywhere in the C-suite and you ask the question, ‘Is the system up?’ you have to recognize that is the wrong question to ask. The right question is, ‘Is the system resilient?’” says Routh.  

About the Author(s)

Carrie Pallardy is a freelance writer and editor living in Chicago. She writes and edits in a variety of industries including cybersecurity, healthcare, and personal finance.
