June 9, 2021
LONDON — Fastly, the company hit by a major outage that caused many of the world’s top websites to go offline briefly this week, blamed the problem on a software bug that was triggered when a customer changed a setting.
The problem at Fastly meant internet users couldn’t connect to a host of popular websites early Tuesday, including The New York Times, the Guardian, Twitch, Reddit and the British government’s homepage.
“We experienced a global outage due to an undiscovered software bug that surfaced on June 8 when it was triggered by a valid customer configuration change,” Nick Rockwell, Fastly’s senior vice president of engineering and infrastructure, said in a blog post late Tuesday.
He said the outage was “broad and severe,” but the company quickly identified, isolated and disabled the problem, and after 49 minutes most of its network was up and running again. The bug had been included in a software update that was rolled out in May, and Rockwell said the company is trying to figure out why it wasn’t detected during testing.
“Even though there were specific conditions that triggered this outage, we should have anticipated it,” Rockwell said.
San Francisco-based Fastly provides what’s called a content delivery network — an arrangement that allows customer websites to store data such as images and videos on various mirror servers across 26 countries. Keeping the data closer to users means it shows up faster.
But the incident highlighted how much of the global internet is dependent on a handful of behind-the-scenes companies like Fastly that provide vital infrastructure, and it amplified concerns about how vulnerable they are to more serious disruption.