Incident escalations and the priority matrix
Incident escalations are mechanisms that help to resolve incidents on time. Escalations use the priority of the incident that has been derived using the priority matrix. In ITSM there are two types of incident escalation:
Functional Escalation
Functional escalation is where an incident is reassigned to another support group because the group currently working on the incident does not have the skills required to resolve it. Some organizations choose to reassign incidents to a more experienced support group after a predefined time interval passes, in accordance with the service level. The time interval typically varies according to the incident priority from the priority matrix; the interval is shorter for incidents of a higher priority.
Hierarchical Escalation
Hierarchical escalation is when staff alert a higher level of authority about an incident, usually their manager and the service owner for the service affected by the incident. The trigger for hierarchical escalation depends on the priority from the priority matrix and the service level for the incident. If the service level is in danger of being breached, then the manager and service owner should be informed so that they can help to preserve achievement of the service level target for this priority of incident, and take proactive steps to maintain customer satisfaction. Deriving the correct priority from the matrix is an important step, as this helps to avoid making unnecessary hierarchical escalations. This is particularly the case for high priority incidents, as many organizations want to alert management and service owners soon after these incidents have been identified using the priority matrix.
For example, an organization might have a 30 minute service level for a P1 incident, but escalate to management at the 20 minute mark if the incident is still unresolved. This would ensure that management are aware of the priority 1 issue before the service level is breached. In contrast, for a P2 incident with a 4 hour fix time the escalation might not occur until after 3 hours. A P3 incident with a 1 day fix time might not get escalated until the service level is breached, and P4 and P5 incidents may never be escalated to management.
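The example timings above can be sketched as a simple lookup. This is a minimal illustration rather than a recommended configuration: the names ESCALATE_AFTER and should_escalate are hypothetical, and the P3 fix time of "1 day" is assumed to mean 24 elapsed hours rather than one business day.

```python
from datetime import timedelta

# Time after which an unresolved incident is escalated to management.
# Values come from the example above; None means "never escalated".
# Assumption: "1 day" for P3 is treated as 24 elapsed hours.
ESCALATE_AFTER = {
    "P1": timedelta(minutes=20),  # ahead of the 30 minute service level
    "P2": timedelta(hours=3),     # ahead of the 4 hour fix time
    "P3": timedelta(hours=24),    # only once the service level is reached
    "P4": None,
    "P5": None,
}


def should_escalate(priority, elapsed, resolved=False):
    """Return True if an incident of this priority is due for hierarchical escalation."""
    threshold = ESCALATE_AFTER[priority]
    if resolved or threshold is None:
        return False
    return elapsed >= threshold


print(should_escalate("P1", timedelta(minutes=25)))  # True
print(should_escalate("P2", timedelta(hours=2)))     # False
```

Keeping the thresholds in one table makes it straightforward to review them alongside the priority matrix when either is changed.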
Where hierarchical escalations are in place, it is important to include testing the escalations as part of the scenario testing for the priority matrix. There should also be regular reviews of the escalations that have taken place, involving support staff and management, to identify whether there are any issues with the priority matrix or associated priority guidance that need to be addressed.
Using a priority matrix for incidents
Incident management is the most common process where a priority matrix is used. Just about every organization does incident management. Even organizations that don’t have a service desk still have to manage incidents. Designing and using a priority matrix to determine the priority of all incidents that are reported by users is essential for building and maintaining customer satisfaction. Most of the time your support teams will have more open incidents than they have the capacity to handle. A priority matrix will help them to sequence their activities, working on the most important incidents first. For managing incidents, the priority matrix will use an assessment of how the incident is affecting users, both in terms of impact and urgency, to come up with the priority. Using the priority matrix in this way will ensure that the incidents affecting the business take priority, removing any guesswork from support staff, and help to avoid them ‘cherry picking’ the tasks that they’d prefer to do first.
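As an illustration of the lookup described above, the sketch below derives a priority from impact and urgency and then sequences open incidents by that priority. The 3x3 matrix, the level names, the incident IDs and the function names are all hypothetical; a real matrix should reflect your own services and service levels.

```python
# Hypothetical 3x3 priority matrix keyed by (impact, urgency).
PRIORITY_MATRIX = {
    ("high", "high"): "P1",
    ("high", "medium"): "P2",
    ("high", "low"): "P3",
    ("medium", "high"): "P2",
    ("medium", "medium"): "P3",
    ("medium", "low"): "P4",
    ("low", "high"): "P3",
    ("low", "medium"): "P4",
    ("low", "low"): "P5",
}


def derive_priority(impact, urgency):
    """Look up the priority for an assessed impact and urgency."""
    try:
        return PRIORITY_MATRIX[(impact, urgency)]
    except KeyError:
        raise ValueError(f"Unknown impact/urgency: {impact!r}/{urgency!r}") from None


def sequence(incidents):
    """Order open incidents so the highest priority is worked first.

    "P1" to "P5" sort correctly as strings; incidents with equal priority
    keep their original (reported) order because sorted() is stable.
    """
    return sorted(
        incidents,
        key=lambda inc: derive_priority(inc["impact"], inc["urgency"]),
    )


open_incidents = [
    {"id": "INC-1", "impact": "low", "urgency": "medium"},
    {"id": "INC-2", "impact": "high", "urgency": "high"},
    {"id": "INC-3", "impact": "medium", "urgency": "high"},
]
print([inc["id"] for inc in sequence(open_incidents)])
# ['INC-2', 'INC-3', 'INC-1']
```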
Most toolsets support the automatic allocation of priority to an incident by using an in-built priority matrix. This can help to speed up the incident management process, although it can still be a good idea for the service desk staff to review the priority that has been automatically assigned using the matrix, as the impact and urgency may have been entered incorrectly.
Some toolsets allow for particular users, such as Directors of the business and critical users, to be identified as Very Important Persons (VIP). When an incident is received for these users, the toolset can automatically increase the urgency and/or impact for use in the priority matrix, enabling the allocation of higher than usual priorities for these incidents.
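The VIP adjustment could work along the following lines. This is a sketch under stated assumptions, not a description of any particular toolset: the three-level scale, the one-level uplift and the names raise_level and apply_vip_uplift are hypothetical.

```python
# Hypothetical scale, ordered from lowest to highest.
LEVELS = ["low", "medium", "high"]


def raise_level(level):
    """Move one step up the scale, stopping at the top level."""
    index = LEVELS.index(level)
    return LEVELS[min(index + 1, len(LEVELS) - 1)]


def apply_vip_uplift(impact, urgency, is_vip, uplift_impact=False):
    """Return the (impact, urgency) pair to use in the matrix lookup.

    For VIP users the urgency is raised by one level; the impact is
    raised as well only when uplift_impact is set.
    """
    if not is_vip:
        return impact, urgency
    if uplift_impact:
        impact = raise_level(impact)
    return impact, raise_level(urgency)


print(apply_vip_uplift("low", "medium", is_vip=True))   # ('low', 'high')
print(apply_vip_uplift("low", "medium", is_vip=False))  # ('low', 'medium')
```

Adjusting the inputs, rather than overriding the resulting priority, keeps VIP incidents inside the same matrix as every other incident.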
Toolsets also have features that can assess incident impact based on factors including the criticality of particular systems or services, times of day, and groups of users, which are then factored into the lookup of the priority matrix to determine the appropriate incident priority. The toolset will record these factors against the incident, which can be used to support the priority derived from the matrix in the event of any subsequent dispute.
It is a good idea to develop a procedure for the dispute of incident priorities. Whilst a priority matrix will simplify the assignment of incident priorities, the resulting priority is still dependent on information gathered from users and monitoring systems, and user expectations can vary. Hence there will be occasions when the priority matrix gives the incident a different priority to the one expected by the user. When this happens, the user must be able to dispute the priority that has been assigned from using the matrix, with the ability for the service desk to amend the priority if the dispute is upheld.
It must also be possible to change the priority of an incident during its lifecycle. For many incidents, the full impact is not known when first reported by a single user. The incident is given a low priority, as the lookup of the priority matrix uses a low impact. As more users report the same or similar symptoms, the impact increases, and the priority matrix would provide a higher priority. For this type of incident, where the known impact increases over time, it is important to continue to use the priority matrix to determine the new priority. Service desks can react to pressure from users to increase priorities without using the matrix. This can result in inconsistencies, and encourage unhelpful user behaviours.
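The sketch below shows one way to re-derive the priority through the matrix as the known impact grows, keeping a record of each change. The user-count thresholds, the rank-sum form of the matrix and all names are assumptions made for illustration.

```python
RANK = {"high": 1, "medium": 2, "low": 3}


def derive_priority(impact, urgency):
    """Hypothetical 3x3 matrix written as a rank sum: high/high -> P1, low/low -> P5."""
    return f"P{RANK[impact] + RANK[urgency] - 1}"


def impact_from_reports(affected_users):
    """Map the number of affected users to an impact level (assumed thresholds)."""
    if affected_users >= 50:
        return "high"
    if affected_users >= 5:
        return "medium"
    return "low"


def reprioritize(incident, affected_users):
    """Re-run the matrix lookup with the latest known impact and log any change."""
    impact = impact_from_reports(affected_users)
    new_priority = derive_priority(impact, incident["urgency"])
    if new_priority != incident["priority"]:
        incident["history"].append((incident["priority"], new_priority, affected_users))
        incident["priority"] = new_priority
    incident["impact"] = impact
    return incident["priority"]


incident = {"urgency": "medium", "impact": "low", "priority": "P4", "history": []}
print(reprioritize(incident, 1))   # P4
print(reprioritize(incident, 12))  # P3
print(reprioritize(incident, 80))  # P2
print(incident["history"])         # [('P4', 'P3', 12), ('P3', 'P2', 80)]
```

Because every change goes through the same lookup, a priority only rises when the assessed impact or urgency has changed, not because of pressure from users.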
Using a priority matrix for major incidents
A major incident is an incident with an extreme adverse impact on the business. This could be from excessive disruption to the services, or risk of subsequent disruption from events such as a virus attack. It is standard practice to have a separate variant of the incident management process for these major incidents, which recognizes the necessary urgency to resolve the incident. Your priority matrix must be able to clearly identify which incidents are major incidents, based on the impact and urgency. Major incidents by definition have the highest priority. It is important to verify through scenario testing that your priority matrix can clearly identify which incidents are major and which are just high priority.
Which priorities are defined as major incidents will depend on how many different priorities are included in your priority matrix. If you only have three priorities, then major incidents will always be P1. However, with just three levels of priority you risk invoking your major incident process on a regular basis. It is much better to have a greater number of priorities in your priority matrix, reserving P1 for genuine major incidents that have an extreme adverse impact on the business. The different levels of Impact and Urgency in your priority matrix need to be carefully designed to come out with a P1 for all genuine major incidents. This requires careful analysis of your services, including assessment of their true criticality to the business, and also an understanding of how users utilize those services.
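Scenario testing of this kind can be sketched as a small check that runs agreed scenarios through a candidate matrix and reports any where the matrix disagrees about what counts as a major incident. Both matrices, the scenarios and the function names below are hypothetical, and treating P1 as the major incident priority is an assumption taken from the discussion above.

```python
def build_matrix(priority_for_score):
    """Build a 3x3 (impact, urgency) matrix from a rank-sum score -> priority map."""
    ranks = {"high": 1, "medium": 2, "low": 3}
    return {
        (impact, urgency): priority_for_score[impact_rank + urgency_rank]
        for impact, impact_rank in ranks.items()
        for urgency, urgency_rank in ranks.items()
    }


# Five priorities: only high impact with high urgency yields P1.
FIVE_PRIORITIES = build_matrix({2: "P1", 3: "P2", 4: "P3", 5: "P4", 6: "P5"})
# Three priorities: P1 has to absorb more combinations.
THREE_PRIORITIES = build_matrix({2: "P1", 3: "P1", 4: "P2", 5: "P3", 6: "P3"})

# (description, impact, urgency, expected to be a major incident?)
SCENARIOS = [
    ("Core service down for all users", "high", "high", True),
    ("Core service degraded, workaround available", "high", "medium", False),
    ("Single user cosmetic issue", "low", "low", False),
]


def failing_scenarios(matrix, scenarios, major_priority="P1"):
    """Return descriptions of scenarios the matrix classifies differently from expectation."""
    failures = []
    for description, impact, urgency, expected_major in scenarios:
        is_major = matrix[(impact, urgency)] == major_priority
        if is_major != expected_major:
            failures.append(description)
    return failures


print(failing_scenarios(FIVE_PRIORITIES, SCENARIOS))
# []
print(failing_scenarios(THREE_PRIORITIES, SCENARIOS))
# ['Core service degraded, workaround available']
```

In this sketch the three-priority matrix would invoke the major incident process for a degraded service with a workaround, which is the kind of over-triggering that a greater number of priorities is intended to avoid.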