04/25/2019

Facebook’s 14-Hour Outage and Why It Wasn’t Handled Properly

Anyone who uses Facebook, WhatsApp, or Instagram knows that Facebook suffered a major outage this March, and many are frustrated with the way it was handled. What happened? How could Facebook and its affected apps go down for so long? What should have been done differently? March’s Facebook fiasco is a perfect example of how not to handle website downtime or a so-called server configuration change.

The Day Facebook Fell off The Face of the Earth

Facebook isn’t a stranger to performance problems or downtime issues, but on March 13th Facebook and some of its associated apps, including WhatsApp and Instagram, faced a whopping 14 hours of downtime. While some speculated the downtime was due to a DDoS attack, the official reason for the outage was a problem with a server configuration change. Apparently, this supposed server configuration change created a domino effect that resulted in the massive amount of downtime that users of the platforms had to endure.

The First Thing Facebook Did Wrong

What was the first thing that Facebook did wrong, and why are so many people frustrated with the 14 hours of downtime that occurred? The lack of communication from the company played a large role in users’ frustration. During the 14-hour outage, Facebook communicated with users only twice, both times via Twitter: once to say it was aware of a problem with the service, and once more to deny that the problem was due to a DDoS attack. When the sites were all back up and running, the company merely stated that the outage was due to a server configuration change. It would not, however, give any further details regarding the issue. At a time when it is crucial for websites to communicate with their user base during downtime, the way Facebook communicated with the public was seen as less than acceptable, especially when one considers that Facebook’s whole business model revolves around communication.

The Second Thing Facebook Did Wrong

Figuring out the second thing Facebook did wrong is a bit harder than realizing they seriously lacked in communication skills and outreach regarding the issue. The simple “it was a server configuration change” excuse doesn’t really give anyone much to go on in terms of why everything actually crashed. A server configuration change shouldn’t result in such a massive amount of downtime – especially for a company as large as Facebook is and as advanced as it should be.

The lack of communication is causing some to believe that Facebook is lying about what went wrong. Those who believe the outage truly was caused by a server configuration change are left wondering if the company is incompetent. Why would a server configuration change cause a 14-hour outage? Was the change not tested on a test server? Was it rolled out all at once instead of regionally? Exactly what changes were being made? Why didn’t Facebook immediately do a rollback instead of allowing the site to be down for 14 hours? Furthermore, regardless of what caused it, why did it take 14 hours to fix the problem? These are all questions that are buzzing around the Internet, and Facebook is giving no answers. While it is apparent that Facebook did something wrong that caused the downtime, with such vague information being provided it’s hard to pinpoint exactly what went wrong and where.
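To make the questions about regional rollouts and immediate rollbacks concrete, here is a minimal sketch of what a staged rollout with automatic rollback could look like. It is purely illustrative and says nothing about how Facebook actually deploys changes: the function names, the region names, and the in-memory "fleet" are all hypothetical.

```python
from typing import Callable, Dict, List


def staged_rollout(
    regions: List[str],
    apply_change: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> Dict[str, object]:
    """Apply a change one region at a time; undo everything on the first failed health check."""
    updated: List[str] = []
    for region in regions:
        apply_change(region)
        updated.append(region)
        if not is_healthy(region):
            # Revert in reverse order so the most recent change is undone first.
            for done in reversed(updated):
                rollback(done)
            return {
                "ok": False,
                "failed_region": region,
                "rolled_back": list(reversed(updated)),
            }
    return {"ok": True, "failed_region": None, "rolled_back": []}


def demo() -> Dict[str, object]:
    # Hypothetical in-memory "fleet": region name -> active config version.
    fleet = {"test": "v1", "eu": "v1", "us": "v1"}

    def apply_change(region: str) -> None:
        fleet[region] = "v2"

    def rollback(region: str) -> None:
        fleet[region] = "v1"

    def is_healthy(region: str) -> bool:
        # Assumption for the demo: the new config breaks only in "eu".
        return region != "eu"

    result = staged_rollout(["test", "eu", "us"], apply_change, is_healthy, rollback)
    result["fleet"] = fleet
    return result
```

In this sketch the change goes to a test environment first, the failure in the second region triggers a rollback of everything already updated, and the last region is never touched. That is the kind of containment the questions above are asking about.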

Who Was Affected?

When downtime hits a website, people usually think of the money that specific site must have lost while it was down. However, advertisers who paid to advertise on Facebook were also affected, as were stockholders: Facebook shares dropped about 1.8 percent the next morning. When a site like Facebook goes down, numerous people are affected, not just Facebook itself. Considering this fact, one would think Facebook would be more careful when making a server configuration change. Even public policy may be affected, as the 14-hour bout of downtime and the vague answers that followed are being used as proof that there needs to be more legislative oversight of the Internet.

What We’ve All Learned from The Facebook Fiasco

The bigger they are, the harder they fall. This saying holds true in light of the 14-hour Facebook outage. It’s important to remember that no matter how big your site is, you’re never bigger than your supporters, customers, and users allow you to be. If Facebook “loses face” over this 14-hour outage, it can and will affect a large number of businesses and individuals. The more your website grows, the more you need things like contingency plans, testing services, and a high-quality website downtime monitoring service. You also need communication plans to keep your visitors in the loop. The lack of communication about the 14 hours of downtime may have lost Facebook more fans than the downtime itself.
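As a rough illustration of the core logic behind downtime monitoring, the sketch below decides when a run of failed checks should trigger an alert. It is a simplified assumption about how such a rule might work, not a description of any particular monitoring product. The function name and the threshold of three consecutive failures are hypothetical.

```python
from typing import Iterable, List


def downtime_alerts(check_results: Iterable[bool], threshold: int = 3) -> List[int]:
    """Return the indices of checks at which an alert would fire.

    An alert fires once when `threshold` consecutive checks have failed,
    and cannot fire again until a check succeeds.
    """
    alerts: List[int] = []
    failures = 0
    for index, ok in enumerate(check_results):
        if ok:
            failures = 0  # any successful check resets the streak
            continue
        failures += 1
        if failures == threshold:
            alerts.append(index)
    return alerts
```

Requiring several failures in a row keeps a single dropped request from paging anyone, while a real outage still raises an alert within a few check intervals. That alert is also the natural moment to start the communication plan.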