ITIL problem management explained

Dive deep into IT problem management strategies that ensure smooth operations and minimize interruptions.


Dec 06, 2023 · 12 MIN READ

IT incidents are inevitable; incident patterns are problematic. When your IT department keeps fighting the same fires, it’s time to reach out to the problem management team so they can resolve the issue for good.

Problem management is a set of processes used to predict and prevent future incidents, eliminate recurring incidents, and minimize the business impact of incidents that can’t be avoided. As a key component of IT service management (ITSM), problem management has specific objectives, processes, challenges, and ways of delivering value to the IT department and the organization overall.

Defining ITIL problem management

Problem management is a core process of the Information Technology Infrastructure Library (ITIL), the most commonly used ITSM framework. According to the ITIL glossary, problem management is “the process responsible for managing the lifecycle of all problems” so IT organizations can “proactively prevent incidents from happening and minimize the impact of incidents that cannot be prevented.”

Although IT problem management is often confused with incident management, they are different (but complementary) functions. ITIL defines an incident as “an unplanned interruption to an IT service or reduction in the quality of an IT service,” while a problem is “a cause or potential cause of one or more incidents.”

To understand the difference between problem management and incident management, it helps to think of a service disruption as an illness. Incident management treats the symptoms, while problem management diagnoses the underlying disease.

What is the main goal of problem management?

Prevention is the key objective of ITIL problem management. Problem managers aim to:

Prevent future incidents from happening
Prevent recurring incidents from happening again
Prevent avoidable business impacts caused by unavoidable incidents

To accomplish these goals, problem managers analyze incident trends, monitoring data, and other relevant information to identify potential and recurring problems and determine their root causes. They then evaluate potential solutions and document workarounds for service desk agents and self-service users to follow in the future.

Simply put, problem management isn’t about solving problems; it’s about ensuring IT teams have fewer problems to solve.

Types of problem management

There are two main types of IT problem management:

Proactive problem management: predicting future incidents and preventing them by eliminating the root cause.
Reactive problem management: reacting to recurring incidents by analyzing the root cause and providing a long-term fix.

Why is problem management important for businesses?

ITIL problem management is the key to continuous service and process improvements. Your problem management team continuously monitors the overall IT infrastructure for issues that impact business operations and drive unnecessary costs. They predict potential incidents so your IT department can proactively intervene, resulting in minimal to no downtime.

Problem managers also maintain your known-error database (KEDB) and document successful workarounds for commonly recurring incidents that can’t be prevented. This information enables service desk agents to work more efficiently because they’re not wasting time solving the same problem over and over, starting from scratch each time. It also enables you to automate services and build a self-service portal.

Problem management process and ITIL

Problem management is one of several key ITIL processes. Each process has its own goals and responsibilities, but they are all interlinked and dependent upon one another to function optimally. If one piece of your service management puzzle is missing, the rest are less effective at improving service delivery.

Other core ITIL processes include:

Incident management: Effective problem management requires close collaboration with incident management teams. Their data and insights help problem managers identify and prioritize recurring incidents. It often takes multiple incidents before problem management has enough data to analyze what is going wrong and figure out what steps can be taken to correct the situation.
Knowledge management: As its name suggests, knowledge management involves the creation of a robust knowledge base or repository of materials. Problem management contributes workarounds to the knowledge base, which incident managers and service desk agents can consult to resolve issues faster. AI-enabled ITSM solutions also leverage this information to power chatbots, self-service portals, and automated resolutions.
Change management: ITIL describes change management as “the process responsible for controlling the lifecycle of all changes, enabling beneficial changes to be made with minimum disruption to IT services.” When a change causes disruptions and/or downtime, it is analyzed under problem management processes. And when recurring incidents require problem management, the solution is applied via change management.
Service request management: ITIL refers to this function as the “request fulfillment process.” It is responsible for responding to all service requests, including tickets asking for new software or hardware, application access, or password resets. If a service request causes a disruption or leads to recurring incidents, problem management might need to get involved.
Configuration management: This is the process for creating a configuration management database (CMDB), which is “used to store configuration records throughout their lifecycle.” CMDBs are an important component of knowledge management. If certain configurations lead to repeat incidents, they might need to be analyzed by the problem management team.
Asset management: Defined by ITIL as “a generic activity or process responsible for tracking and reporting the value and ownership of assets throughout their lifecycle.” Asset monitoring solutions supply important data to the problem management team, alerting them to potential hardware or software failures that need to be proactively addressed.


What is the ITIL problem management process?

No two IT problems are identical. Problem management addresses various incidents, leveraging different data sources, tools, and subject-matter experts to resolve each issue. But while the circumstances may vary, the ITIL problem management process typically follows the same workflow.

How does problem management work?

Problem detection

To predict future incidents and detect recurring incidents, your problem management team continuously gathers and monitors incident trends, hardware and software failures, and resource utilization and capacity issues.

An existing problem might be detected within an incident report or by users or service desk technicians. Ideally, problems are proactively detected by your ITSM software or machine monitoring tools. Proactive problem detection can go a long way in preventing service disruptions.
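
As a rough illustration of how recurring incidents might be surfaced from incident data, the sketch below counts incidents per configuration item and category and flags any combination that crosses a threshold. The `Incident` fields, the sample IDs, and the threshold of three are assumptions made for this example; they are not part of ITIL or of any particular ITSM product.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    incident_id: str
    ci: str        # affected configuration item (hypothetical field name)
    category: str  # e.g. "network" or "hardware"


def find_recurring(incidents, threshold=3):
    """Return (ci, category) pairs that appear at least `threshold` times."""
    counts = Counter((i.ci, i.category) for i in incidents)
    return {key: n for key, n in counts.items() if n >= threshold}


# Made-up sample data for illustration only.
incidents = [
    Incident("INC-1", "wifi-ap-12", "network"),
    Incident("INC-2", "wifi-ap-12", "network"),
    Incident("INC-3", "pos-terminal-4", "hardware"),
    Incident("INC-4", "wifi-ap-12", "network"),
]

candidates = find_recurring(incidents)
print(candidates)  # {('wifi-ap-12', 'network'): 3}
```

A real monitoring setup would likely also consider time windows and severity, but even a simple count like this can point problem managers toward patterns worth investigating.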

Problem logging (categorization and prioritization)

Once problems are detected, your problem management team must categorize, prioritize, and document them. If you have the right ITSM solution in place, it might automatically handle this part of the process.

Keeping a record of problems is important for future reference. Each problem record should include:

Problem type and category
Incident description
Associated incidents
Affected configuration items (CIs) from the CMDB
User information
Status
Resolution
Closure

This information enables your team to tag known errors and manage them in a database. It also helps with prioritization, which is based on:

Impact: the number of users and CIs affected by the problem
Urgency: how quickly the resolution is needed

Prioritization involves assessing the problem’s impact and urgency to determine how fast a problem needs to be resolved.
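
One common way to combine impact and urgency is a simple priority matrix. The sketch below is a minimal version of that idea; the three-level scale, the multiplication-based score, and the "P1" to "P3" labels are illustrative assumptions, and your own thresholds may differ.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def priority(impact: Level, urgency: Level) -> str:
    """Map impact and urgency to a priority label (illustrative thresholds)."""
    score = impact * urgency  # ranges from 1 (low/low) to 9 (high/high)
    if score >= 6:
        return "P1"
    if score >= 3:
        return "P2"
    return "P3"


print(priority(Level.HIGH, Level.HIGH))      # P1
print(priority(Level.MEDIUM, Level.MEDIUM))  # P2
print(priority(Level.LOW, Level.MEDIUM))     # P3
```

Keeping the mapping in one small function makes it easy to adjust the cutoffs as the team learns which problems actually deserve the fastest response.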

Investigation and diagnosis

Next, problem managers will investigate the incident pattern to diagnose the root cause of the problem. They will often review the KEDB, looking for similar problems. They might also loop in relevant subject-matter experts who can help them better understand what’s going wrong and brainstorm possible resolutions.
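
To make the KEDB review step concrete, here is a small sketch of a lookup that ranks known errors by how many words their symptom shares with a new problem description. The entry structure, the IDs, the workaround text, and the word-overlap matching are all hypothetical; a production KEDB would typically rely on the search features of the ITSM tool itself.

```python
def tokenize(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def similar_known_errors(description, kedb, min_overlap=2):
    """Return IDs of known errors sharing at least `min_overlap` words
    with the description, best match first."""
    words = tokenize(description)
    matches = []
    for entry in kedb:
        overlap = len(words & tokenize(entry["symptom"]))
        if overlap >= min_overlap:
            matches.append((overlap, entry["id"]))
    return [entry_id for _, entry_id in sorted(matches, reverse=True)]


# Made-up known-error entries for illustration only.
kedb = [
    {"id": "KE-101", "symptom": "wifi connection drops in restaurant",
     "workaround": "restart access point"},
    {"id": "KE-102", "symptom": "printer queue stuck",
     "workaround": "clear print spooler"},
]

print(similar_known_errors("guest wifi drops every evening", kedb))  # ['KE-101']
```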

Resolution

There are two possible ways to resolve each problem:

Problem control: Using root cause analysis, problem managers identify the underlying cause of the incident and convert it to a known error. Then workarounds can be created and documented to ensure the incident is handled most effectively in the future.
Error control: Problem managers find permanent solutions for known errors so they don’t recur in the future.

Once a solution is determined, problem managers can pass the problem and resolution along to your change management team, who are responsible for evaluating, planning, and executing the changes/solution.
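
The path from a logged problem to a known error and then to a permanent fix can be pictured as a small state machine. The sketch below is one possible way to model it; the status names and allowed transitions are assumptions for illustration, not statuses defined by ITIL or by a specific tool.

```python
# Hypothetical statuses and the transitions allowed between them.
ALLOWED = {
    "logged": {"under_investigation"},
    "under_investigation": {"known_error"},  # root cause found, workaround documented
    "known_error": {"change_requested"},     # permanent fix handed to change management
    "change_requested": {"resolved"},
    "resolved": {"closed"},
    "closed": set(),
}


def advance(status, new_status):
    """Return new_status if the transition is allowed, else raise ValueError."""
    if new_status not in ALLOWED.get(status, set()):
        raise ValueError(f"cannot move from {status} to {new_status}")
    return new_status


status = "logged"
for step in ["under_investigation", "known_error", "change_requested"]:
    status = advance(status, step)
print(status)  # change_requested
```

Rejecting invalid jumps (for example, from "logged" straight to "closed") is one way to enforce that no step in the workflow gets skipped.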

Closure

At this point, the problem (and any related incidents) can be closed. This involves:

Verifying that any details entered during the logging and classification process are accurate
Ensuring the change management team has all the information they need to implement the solution
Reviewing the resolution of the problem and its impact on the business
Carrying out a risk analysis to ensure the problem management process runs smoothly and continues to improve

This review should be recorded and shared with relevant teams and individuals.

Roles and responsibilities for successful problem management

The problem manager is responsible for analyzing historical and real-time incident data, identifying repeat incidents, and determining which problem-solving efforts will reap the most significant benefits for the organization. The role requires:

Collaboration: Problem managers must work closely with other service management roles—especially incident management (which helps them identify repeat incidents) and change management (which helps them implement resolutions).
Coordination: Problem managers are responsible for coordinating the lifecycle of each problem, from diagnosis to resolution. This requires a strong awareness of the knowledge and skill sets available throughout the IT organization so problem managers know who can help solve each problem.
Communication: Problem managers are responsible for maintaining an up-to-date problem queue and informing stakeholders of progress.
Prioritization: Problem managers work with business leaders across the organization to ensure they understand the business goals and the impact of problems so they can correctly prioritize resolutions.
Documentation: Problem management helps to build the ITSM knowledge base, with known error records and workarounds for the service desk and self-service portals to use.


Benefits of problem management

Effective problem management enables IT teams to prevent some incidents and reduce the business impact of others, which:

Improves IT service design and delivery
Boosts service agent productivity
Shortens incident resolution time
Minimizes service interruptions and downtime
Saves time and money
Increases employee and customer satisfaction

Problem management best practices

Problem management isn’t just an ITSM box to check. Done right, this ITIL process enables continuous service and performance improvements, making your IT organization more agile and productive with each problem that gets solved. To get the most value from your efforts, keep these problem management best practices in mind:

Assign and empower dedicated problem managers: This individual or team (depending on the size of your IT organization) has clear roles and important responsibilities. Make sure they have the time, resources, and authority to execute the problem management process according to ITIL standards.
Put a communication strategy in place: Problem managers act as liaisons between incident managers and change managers. When a problem comes up, it is important to keep the lines of communication open between these teams and to ensure that affected end users receive regular updates about problem resolution status. (This is where automation with your service desk tool can come in handy.)
Make use of both proactive and reactive problem management: Understand the differences between the two methods of problem management and the scenarios in which they can apply.
Keep up with SLAs (service level agreements): Problem management has its own SLAs, so make sure you can meet these deadlines according to severity and urgency.
Check the KEDB: Assuming you already have a rich repository of known errors, refer to the KEDB for swifter problem resolution. If you don’t have this resource, start building it and keep it up to date.
Align problem management priorities with business goals: Not all problems require action, and not all actions should be given the same priority and attention. A problem manager with good analytical skills knows the difference and can ensure IT’s scarce resources are applied to the highest-value opportunities.
Don’t skip steps in the problem management process: The problem management workflow is a proven guide to quick and effective resolution, and every step is mission-critical.
Leverage ITSM software to align ITIL functions: Integrating problem management with other ITIL modules (such as change management and incident management) keeps information in sync and consistent. A robust ITSM platform aids cross-function collaboration and enables service delivery automation.

Examples of successful problem management in action

Problem management is a wide-ranging function that boosts the effectiveness of other ITIL processes and IT functions. When you have the right people and technology in place to support problem management efforts, your incident management team has fewer incidents to address. Your service desk agents have the information they need to resolve tickets faster and can offload queries to a robust self-service portal that lets users solve their own problems. And your change management team knows exactly what changes they need to make to improve service delivery with minimal downtime.

Take Cater Care, for example. As Australia’s leading contract catering company with 207 locations, the business delivered savory food to customers but struggled to deliver satisfactory IT services to employees. Service desk agents spent most of their time on mundane, repetitive tasks until the company implemented Freshservice ITSM. Equipped with the right tools, incident and problem managers were able to identify and solve those redundant issues once and for all, automate workarounds, and improve processes and workflows.

Blair Logan, head of IT for Cater Care, explains, “The problem we were trying to overcome was the lack of coordination amongst the teams and changing the way support functions interact with the front lines of the business. Freshservice ticked all our boxes and helped us streamline our processes.”

Restaurant chain L’Osteria faced a similar issue. Its IT team supported 124 restaurants across eight countries, all of which faced the same recurring incidents, including Wi-Fi problems and machine failures.

“The IT team was receiving repetitive requests on recurring issues and our agents were spending too much time answering them,” says IT Manager Manfred Schneider. “This was a major source of inefficiency for L’Osteria.”

Using Freshservice for problem management, L’Osteria built a robust knowledge base with information about how to solve known issues, launched an AI-powered self-service portal, and automated recurring workflows. As a result, IT achieved a 100% customer satisfaction (CSAT) score and boosted service desk agent productivity.

Likewise, transportation software provider Trapeze built a Freshservice-powered knowledge base that has decreased ticket volume by 15% and raised their CSAT score to over 75%. And international advertising agency M&C Saatchi says Freshservice ITSM improved its incident, change, and problem management capabilities, and boosted self-service portal utilization by 300%.

The moral of the story: When IT teams get on the same page (or the same ITSM platform), problems get solved faster and KPIs improve quickly.

Evaluating IT problem management tools

Problem managers rely on a variety of technologies to gather and analyze data and manage their workflows, including:

Data analytics and reporting tools
Statistical analysis tools
Incident management and ticketing systems
Configuration management database
Machine monitoring software
Known-error databases
Knowledge management (knowledge base) systems

These functions can be performed by different solutions or a single problem management solution. Or, ideally, they can be handled by a robust ITSM tool that integrates problem management, incident management, change management, and other key ITIL processes into one platform.

Either way, a good problem management software solution is one that can:

Automate problem management workflows: Track and change problem resolution status, identify associated incidents, and facilitate interdepartmental communication about closure
Perform root cause analyses: Analyze the underlying cause of a problem and its impact, and track the resolution or workaround within the portal to monitor its progress
Create and maintain a KEDB: Document known errors and resolutions, link problems to existing incidents, changes or releases, and anticipate service disruptions
Prevent incidents with workarounds and solutions: Provide service agents with workarounds and contextual insights about problems, with attached links to references and resources—all from a single window
Build a knowledge base that powers self-service: Provide end users with resources they can use to solve their own problems, based on known-error data and documented workarounds
Leverage AI: Enable predictive analytics, intelligent recommendations, and self-service chatbots by leveraging the latest AI in ITSM capabilities

Simply put, the best problem management software helps problem managers do what they do best: prevent as many problems as possible and minimize the business impact of those that can’t be avoided. This way, IT teams can stop wasting time putting out the same old fires and spend more time delivering value and driving innovation.


