12 ways to ace customer communications during a system outage

System outages are the worst nightmare for IT support teams, but they also provide an opportunity to stand out. During a major service outage, customers often feel the impact more acutely because they have far less information about what is happening.

Some of the biggest outages that affected users all over the world last year include those of Slack, PlayStation, Airbnb, FedEx, and Amazon.

Slack appeared to be having issues for some people on October 1st, 2021. Users were unable to connect to the service, received errors while trying to send messages, and couldn’t access Slack’s website. Of course, the culprit was DNS (it’s almost always DNS). On Monday, October 4th, 2021, a network outage at Facebook and its associated services, WhatsApp and Instagram, resulted in some classic ‘offline time’ for the entire world.

On June 8th, 2021, Fastly’s outage, which lasted almost an hour, caused major websites such as Amazon, eBay, Reddit, Spotify, Twitch, The Guardian, The New York Times, and even the UK government’s websites to be unavailable. In a similar incident, Akamai’s edge DNS had an issue that impacted quite a number of websites globally on July 22nd, 2021 for about an hour.

But, we need to admit, all of these issues were communicated beautifully. This brings us to a useful idea: the service recovery paradox. Customers who see you recover well from a service failure can end up more satisfied than they would have been had no failure occurred at all. This is why downtime, service degradations, and maintenance activities are opportunities to make a significant impact on your customer experience by communicating clearly with customers about system outages.

12 outage communication best practices 

Communicating status updates during a major incident is one of the most nerve-wracking parts of the job. You’re under pressure, often with limited information, and writing for a really upset audience. What do you do?
The short answer: plan for surprises. Think ahead about the most common types of outages, and come up with some templates. As you’re drafting your communications for an unplanned system outage, follow these twelve best practices.

1. Acknowledge the issue

If any service availability issue or downtime impacts a significant number of your customers, send an initial message out – something that notifies your customer of the impact. Nothing shakes customer confidence like a status page that is showing “all good!” when major problems are occurring.

2. Be accessible

If your customers and partners never see your status message, then it is of no use. So wherever you convey your status updates, ensure that your customers know where to look. Link your status page in key locations like your contact page, support/operations pages, Twitter account, or other social handles.

3. Empathize with the customers

The last thing you should do is fall back on clichés like “we apologize for any inconvenience”. Share something more specific, honest, and to the point. This shows genuine care and understanding for customers who are in the dark and heavily affected by the downtime.

4. Communicate clearly

Your unique, handcrafted, one-of-a-kind message means nothing if your customers don’t understand what to take from it. It’s not always possible, but the more clearly you can define who is being affected and in what ways, the easier you make it for your customers to understand.

5. Focus on customer impact

Describe issues in terms of how the customer is affected rather than the internal cause. “Customers are unable to pay for goods” is better than “our payment gateway is down.” Inform the customer: let them know what is happening and what that means for them.

6. Suggest alternatives (if any)

Suggest and explain any alternative solutions or workarounds that customers can use while you’re working to resolve the issue. This shows that while you’re sincerely working on the problem, you’re also lending a helping hand to your customers with the immediate problem in front of them.

7. No blame-game; take full responsibility

You’re still responsible for your customers’ experience even if the fault is with a third-party system you use (and sometimes you can even solve problems outside your domain). Build their confidence: let them know the situation is being taken seriously and actively worked on so they can safely do other work in the meantime.

8. Provide important context

Mentioning a third party can be useful information if it gives your customers a better picture of what’s happening and how that will affect them. For example, “We’re in contact with our payment gateway, and once we know more from them, we’ll update you here.”

9. Write for your audience, not yourself

Provide as much detail as will be helpful, but no more. Use the language your customer would use to explain the outage. A ‘DNS zone server authorization’ issue means nothing to your customer; instead, use something along the lines of “Mobile App Log-in issues” or “You will be unable to log in to your mobile app due to …”.

10. Never over-promise

It can be so tempting to say “we should be up in five minutes,” but outages can change so quickly that it’s better to reserve timelines for when your technical team has triple-confirmed them.

11. Make it interesting

As long as you’ve got honest, clear communication covered, a little smiley emoji or a GIF can help you connect with your customers. Add a human touch and some personality; don’t turn into a corporate robot.

12. Always follow up

Even if you don’t have new information to share, consistently updating your messages helps those impacted know that you’re still working on it and that they haven’t been forgotten.

Pro tip: You’ll also want to keep your status page on separate infrastructure to minimize the risk of an incident taking down your service and status page at the same time.
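
Several of the practices above can be baked into a reusable template so that nobody has to start from a blank page mid-incident. What follows is only a minimal sketch in Python, not a reference to any particular status page tool: the `StatusUpdate` class, its field names, the status labels, and the 30-minute update interval are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class StatusUpdate:
    """One customer-facing update. All names here are hypothetical."""
    customer_impact: str                # practice 5: what customers cannot do
    status: str                         # assumed labels, e.g. "Investigating", "Resolved"
    workaround: Optional[str] = None    # practice 6: alternative, if any
    context: Optional[str] = None       # practice 8: useful third-party context
    update_interval_minutes: int = 30   # practice 12: assumed follow-up cadence

    def render(self, now: datetime) -> str:
        # Lead with the impact on the customer, not the internal cause.
        lines = [f"[{self.status}] {self.customer_impact}"]
        if self.workaround:
            lines.append(f"Workaround: {self.workaround}")
        if self.context:
            lines.append(self.context)
        if self.status != "Resolved":
            # Practice 10: commit to the next update, not to a fix time.
            next_update = now + timedelta(minutes=self.update_interval_minutes)
            lines.append(f"Next update by {next_update:%H:%M} UTC.")
        return "\n".join(lines)


update = StatusUpdate(
    customer_impact="Customers are unable to pay for goods.",
    status="Investigating",
    context=(
        "We're in contact with our payment gateway, and once we know "
        "more from them, we'll update you here."
    ),
)
# The timestamp is arbitrary and only makes the example reproducible.
message = update.render(datetime(2024, 1, 1, 15, 0, tzinfo=timezone.utc))
print(message)
```

The template promises a time for the next update rather than a time for the fix, which keeps the message honest even when the cause is still unknown.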

Incident communication is a skill that can be improved with practice. While communication is key, your game plan for major incident management makes all the difference in minimizing the impact of the incident. Implementing alert management systems with integrated monitoring tools and AIOps can help IT agents proactively handle major incidents and free up IT operations to pursue root cause analysis and higher quality fixes. A centralized IT operations view gives IT teams visibility into modern and legacy IT systems, helps break down silos, accelerates response times, and minimizes downtime and any further impact on business reputation.

There is scope to learn from every single incident, and these learnings can be applied to your incident prevention strategy, helping to prevent a similar incident from occurring again. Genuine communication followed by swift action is always the rule of thumb when it comes to converting your downtime into a happy customer experience.

For further information on how Freshservice can help you build more automated, efficient, and intelligent IT service and operations management, get in touch with our IT experts and change the course of your ITSM strategy today.

Author: Shamita Sharma

Design credits: Sharmila Prabhakaran