What Microsoft Azure, CloudFlare Outages Show
Recent cloud outages underscore that widespread chaos can result when infrastructure weak points are concentrated rather than dispersed.
On March 3, a router glitch caused Web infrastructure service CloudFlare to disappear from the Internet for more than an hour, cutting off every website protected by the firm's widespread network. A week earlier, every customer of Microsoft Azure lost secure access to the service's storage network for half a day, after three critical certificates were accidentally allowed to expire.
While companies deal with system failures every day, a single point of failure in a cloud service or its infrastructure affects more than one company. Every firm that uses an affected cloud provider's service faces a potential outage. The recent cases underscore the danger that cloud services can take down dependent parts of the Internet if providers do not seek out and disperse points of failure, said Matthew Prince, co-founder and CEO of CloudFlare.
"The challenge, when you have these systemic problems, is that -- when, in Azure's case, they let the domain certificate expire and it affected all their customers at once or, in our case, when we hit a bug with the software that is running on our routers -- instead of having a very limited impact, you have an impact against all the service, generally," Prince said. "Even though the time of the impact may be short, it impacts a large number of customers."...