Problem Manager

The problem manager is a key role within a company’s IT Service Management (ITSM) organization. The primary task of the role is to prevent incidents from happening and to minimize the impact of incidents that can’t be prevented. The problem manager identifies, prioritizes and assigns responsibility for problems and then manages them through the entire process to resolution. A key part of the role is creating and managing a knowledge base that holds information about known errors and workarounds for the service desk and self-service portals to use.

Key Activities

There is often confusion about a problem manager’s day-to-day activities. The problem manager is not there to solve problems. The problem manager is responsible for analyzing incident trends, identifying repeat incidents and determining where the application of problem-solving efforts will reap the biggest benefits for the organization. A problem manager works throughout the entire organization, leveraging other resources for knowledge, skills and assistance in the diagnostic process.
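As a rough illustration of what identifying repeat incidents can look like in practice, the sketch below groups incident records by service and symptom and reports the groups that recur. The record layout, the identifiers and the threshold of two occurrences are all assumptions made for this example; a real ITSM tool will expose its own fields.

```python
from collections import Counter

# Hypothetical incident records: (incident_id, affected_service, symptom).
incidents = [
    ("INC-1", "email", "mailbox sync failure"),
    ("INC-2", "vpn", "connection timeout"),
    ("INC-3", "email", "mailbox sync failure"),
    ("INC-4", "email", "mailbox sync failure"),
    ("INC-5", "vpn", "connection timeout"),
    ("INC-6", "printing", "driver crash"),
]


def repeat_incidents(records, min_count=2):
    """Return (service, symptom) groups seen at least min_count times,
    ordered from most to least frequent."""
    counts = Counter((service, symptom) for _, service, symptom in records)
    return [(key, n) for key, n in counts.most_common() if n >= min_count]


repeats = repeat_incidents(incidents)
for (service, symptom), n in repeats:
    print(f"{service}: {symptom} ({n} incidents)")
```

Groups at the top of this list are candidates for a problem record, subject to the business-impact assessment described elsewhere in this article.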

 

Working Environment

The problem manager is responsible for assembling a big-picture perspective, gathering data from different sources, interpreting meaning, projecting possible outcomes, assessing impact, evaluating alternatives, managing a portfolio and recommending priorities for problem resolution. To do this effectively, he or she must collaborate with many other service management roles. The two closest working relationships are with the incident management team, whose incident records the problem manager analyzes to identify repeat incidents, and with the change management team, which implements permanent fixes to problems. Problem managers must also work with business leaders to understand the business’s goals and the impact of problems, so they can correctly prioritize problems for resolution.

Value of a Problem Manager role in the organization

Improved quality of services and solutions

Repeat incidents cause frustration for staff and customers and they ultimately affect the bottom line of the business, by increasing costs and/or decreasing customer satisfaction and retention. These impacts can be minimized when an organization focuses on eliminating repeat incidents and when the problem manager leads this effort. If an organization decides to implement a proactive problem management strategy, then repeat incidents can be avoided and potential negative effects can be eliminated. The results of these efforts are improved quality and reliability of the services and solutions for users.

Manage organizational risks

The problem manager has a critical responsibility for managing and mitigating organizational risks. Major incidents can have a huge impact on the reputation and financial performance of an organization, while minor incidents can significantly decrease the organization’s productivity. It is not possible to avoid all incidents. The problem manager is responsible for understanding the potential technical issues the company may encounter, assessing their likelihood and impact to understand the overall risk exposure, and developing an appropriate set of remediation plans.

More efficient allocation of resources to fix problems

Problem solving is executed by assembling emergent teams that respond to problems, which may require gathering people from throughout the organization, often from both business and technical disciplines. A good problem manager will know who is likely to have the information needed to correctly identify the root cause of a problem. Understanding the skill sets of subject-matter experts, assessing their workloads and allocating problem-solving activities appropriately together ensure efficient allocation of the organization’s most valuable resources.

Clear accountability for Continuous-Improvement activities

Problem management and continuous improvement are closely related. Removing repeat incidents will contribute to the continuous improvement of IT services. Problem management is the only process that provides specific tools for continuous improvement by identifying actual or potential failures and transforming these into opportunities for improvement. 

Goals for the Problem Manager

A good problem manager will develop a wide breadth of skills, knowledge and experience. It is tempting for organizations to assign problem managers to a variety of activities that don’t match their core job roles. To avoid this and to maximize value from the company’s problem-management investments, the problem manager should be allowed to focus on pursuing three goals:

Scaling the Problem Manager role

Problem management is a function that every organization needs. Unfortunately, not every organization can afford a full-time problem management team or even a dedicated individual. Problem manager roles are appropriate for companies of all sizes and can be adapted to the company’s scale and needs.

Part-time responsibility

Many organizations will not have the luxury of a dedicated problem manager. In this case, it is still essential that one person is responsible for problem management. This could be anyone in the IT department, with one notable exception – your problem manager should not be responsible for incident management as well. 

There is a conflict of interest between incident and problem management. The incident manager’s task is to return the customer to his or her work as quickly as possible, while the problem manager must gather information to determine the root cause of the incident. In a small organization, a dedicated problem manager is not needed, but responsibility for problem management should be assigned to someone.

One person

Many medium-sized organizations choose to assign a single person as a problem manager, but this can create challenges if multiple incidents require his or her attention. This arrangement can be successful if the problem manager has sufficient authority to ask for the assistance of subject-matter experts if needed. The problem manager must also be able to delegate problems to team members in other parts of the organization. A problem manager is not there to solve the problems; he or she prioritizes, delegates and manages the lifecycle of the problem.

Small team

In a larger organization, a dedicated problem management team may be able to share the workload of analysis and queue management. This team should have the skills to facilitate problem-solving sessions and good critical-thinking skills to enable them to lead emerging teams in problem-solving exercises.

Large, distributed problem management function

This is probably the most common, and most successful, means of structuring problem management in a medium to large enterprise. In this model, one problem manager leads a team of trained problem solvers and facilitators who may represent both business and IT units.

Training a diverse group from IT and the business units in sound problem-solving methodologies will result in a pool of people available during a crisis. These problem solvers can also be used to oversee parts of the problem queue, requesting other subject-matter experts as required.

Problem management skills

Problem managers need a diverse toolbox of skills, knowledge, experiences and relationships to be successful. Training a range of people throughout the organization in the core skills of problem solving and giving them the opportunity to use these skills and share them with other members of their teams will strengthen the value of the organization’s problem-solving capabilities.

Consistent processes and tools

Most ITSM tools have well-developed problem-management capabilities that can support your team throughout the problem-solving process. Many of these tools build the steps from industry-standard methodologies directly into the toolset, prompting staff on which steps to take and when to take them.

Technical knowledge

A problem manager must understand the basics of the technical aspects of problems to serve as the translator between the business and IT, but he or she should not be expected to be a technical expert of every problem he or she is managing. While the problem manager must have some level of technical knowledge, it is more important he or she knows where to access that information. He or she must understand the business and be able to identify the subject matter experts in each area of the business and the IT department. 

Problem-solving skills

Critical-thinking and root-cause-analysis skills are the most important skillsets for the problem manager. The ability to step back, look at a problem logically, apply intuition and know who must be engaged to resolve the issue is essential for a problem manager.

Business knowledge

To assess and prioritize the work on problems, the problem manager must have a good understanding of the business. To manage a problem queue efficiently, he or she must know the business’s goals, who the customers are and where business value is delivered.

Data analysis and statistics

A good problem manager will have an analytical mind. He or she must spend a considerable amount of effort understanding incidents, reviewing reports and analyzing data to understand the cause and effect of different situations. Statistical methods are often used in problem management to assess risk and evaluate alternative solutions.

Aligning Problem Managers within your organization structure

One of the biggest benefits of effective problem management is the potential to contribute to continuous-improvement activities in the organization. Problem managers may work in many places within your organizational hierarchy, but they are most effective when they can directly influence the decision-making process. Executive sponsorship of the problem management role is critical. Without the sponsorship of management and delegated authority, the problem manager is unlikely to be successful.

Problem Managers within the ITSM organization

Problem management is a key process in the ITIL® framework, so it isn’t surprising that most organizations place their problem managers within their service management organization. This gives them strong influence over service-management-governance processes and fosters deep collaboration with incident and change management functions.

Problem managers placed within business functions

Problem managers don’t have to be placed within the IT organization. In organizations where business functions play an active role in technology planning and prioritization, problem managers may be a part of business functions. This structure is also helpful if the scope of problem-management activities includes process, supplier and people-related issues (not just technical problems). 

Problem managers outside of IT 

Top attributes (skills and traits) to consider when hiring a Problem Manager

Not everyone has the skillset and mindset to be a good problem solver. Problem-solving takes a specific mentality and an ability to think critically about the issues at hand. A good problem solver is a good thinker. He or she does not become emotionally involved in the issues he or she is addressing. He or she will use a mixture of intuition and logic to determine a solution. Here are some of the top traits to consider when hiring a problem manager:

1. Curiosity about how processes, systems, etc. work

2. Ability to handle ambiguity

Problems are not always clear. Many red herrings will appear during a successful resolution process. A good problem manager will be able to recognize these and carefully assess them before dedicating time to research possible dead-ends. He or she understands many situations are not always what they seem and will always seek confirmation before investing in a course of action. 

3. Experience in evaluating trade-offs

Often, there is no perfect solution to a problem, because fixing one issue may cause another. A problem manager must be able to assess the best solution for the business and understand that the organization may need to accept the lesser of two evils. Being able to determine which is the lesser is the mark of a good problem solver.

4. An understanding of opportunity cost

Frequently, there will be more than one solution to a problem. An experienced problem manager will be able to assess which solution will provide the best opportunities for the organization, and what is lost by not selecting the other options. The problem manager must be able to articulate the opportunity cost clearly to allow the business to select the best of multiple options.

5. Experience/training in risk-management techniques

Risk management is an important part of the problem manager’s role. He or she must be able to assess quickly the risks of taking no action or implementing any number of possible solutions. He or she must be able to identify the risks clearly and define the mitigating and corrective actions to be taken to minimize the risk to the business.

6. Data and statistical analysis experience

A problem manager will spend a great deal of time looking at statistics and analyzing incident queues and repeat calls. There is much more to prioritizing the problem queue than the number of calls logged for one problem. The problem manager must look at the cost to the business of each outage, the disruption to the customer, the damage to reputation and the effects on financial performance. He or she will then have to weigh these statistics against the cost of providing a solution to the problem.

Measuring Problem Manager Value and Performance (metrics)

Percentage of problems with an identified Root Cause

One of the basic statistics for assessing the efficiency of the problem-management capability in an organization is how many of the problems added to the queue have an identified root cause. Identifying the root cause does not mean the problem has been fixed, or even that it will be fixed, but it is the starting point for assessing whether a permanent fix is viable and worth further work.
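A minimal sketch of this metric, assuming each problem record is a dictionary with a `root_cause` field that is empty until a cause has been identified. The field name and the sample records are hypothetical.

```python
def root_cause_rate(problems):
    """Percentage of problem records that have a recorded root cause."""
    if not problems:
        return 0.0
    identified = sum(1 for p in problems if p.get("root_cause"))
    return 100.0 * identified / len(problems)


# Hypothetical problem queue.
problems = [
    {"id": "PRB-1", "root_cause": "expired certificate"},
    {"id": "PRB-2", "root_cause": None},
    {"id": "PRB-3", "root_cause": "memory leak in sync service"},
    {"id": "PRB-4", "root_cause": None},
]

rate = root_cause_rate(problems)
print(f"{rate:.1f}% of problems have an identified root cause")
```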

Risk – Evaluation of Problems

Not every problem is likely to recur or to cause significant issues if it does. Assessing each identified problem for the risk it presents to the organization is one of the first steps in the prioritization process. A problem that has a minimal risk, reputationally or financially, will almost always be lower on the list of priorities than a problem with a higher risk.

Prioritization of fixes 

After the risk assessment has been completed on identified problems, this information can be combined with the estimated cost of providing a permanent fix to prioritize the problem queue for resolution. This is a balancing act that a good problem manager performs daily so that the available resources are directed to the work that provides the best value to the organization.
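One simple way to combine risk and fix cost is sketched below. It assumes risk exposure can be approximated as likelihood multiplied by impact, and it ranks problems by exposure per unit of fix cost. Both the scoring rule and the figures are illustrative assumptions, not a prescribed method, and a real assessment would also weigh factors such as reputation that are hard to quantify.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    problem_id: str
    likelihood: float  # assumed probability of recurrence in the period (0 to 1)
    impact: float      # assumed cost to the business of one occurrence
    fix_cost: float    # estimated cost of a permanent fix, must be > 0

    @property
    def risk_exposure(self) -> float:
        # Simplified exposure: likelihood multiplied by impact.
        return self.likelihood * self.impact


def prioritize(problems):
    """Order problems by risk exposure per unit of fix cost, highest first."""
    return sorted(problems, key=lambda p: p.risk_exposure / p.fix_cost, reverse=True)


# Hypothetical queue.
queue = [
    Problem("PRB-1", likelihood=0.9, impact=20_000, fix_cost=6_000),
    Problem("PRB-2", likelihood=0.2, impact=100_000, fix_cost=40_000),
    Problem("PRB-3", likelihood=0.5, impact=8_000, fix_cost=1_000),
]

ranked = [p.problem_id for p in prioritize(queue)]
print(ranked)
```

In this sample, the cheap fix for PRB-3 ranks first even though PRB-2 carries the largest exposure, which is the kind of trade-off the problem manager has to explain to the business.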

ROI of recommendations

Providing a long-term fix for a problem can be an expensive exercise. Calculating the return on investment for this work is a critical part of the problem manager’s role. He or she must be able to articulate clearly the benefits versus the cost of each recommendation. Ultimately, the business will use this information to decide which fixes provide the best returns to the organization.
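The calculation can be sketched with the common definition of ROI as net benefit divided by cost. The sketch assumes the benefit of a fix is the incident cost it avoids each year; the function name, parameters and figures are hypothetical.

```python
def roi(annual_benefit, fix_cost, years=1):
    """Return on investment as a ratio: (total benefit - cost) / cost."""
    if fix_cost <= 0:
        raise ValueError("fix_cost must be positive")
    return (annual_benefit * years - fix_cost) / fix_cost


# Hypothetical fix: costs 20,000 and avoids 30,000 of incident cost per year.
first_year = roi(30_000, 20_000)
three_year = roi(30_000, 20_000, years=3)
print(f"1-year ROI: {first_year:.0%}, 3-year ROI: {three_year:.0%}")
```

The time horizon changes the result considerably, so it is worth stating it alongside any ROI figure presented to the business.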

Cycle time 

The interval of time from when an incident is first reported to the implementation of a long-term fix is a key efficiency measurement. A good problem manager will identify incidents quickly, then assess and prioritize them for action. The faster the problem-management process is executed, the lower the risk exposure and potential impact to the organization.
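A minimal sketch of the measurement, assuming each record carries the timestamp of the first incident report and the timestamp at which the permanent fix went live. The dates are invented for the example, and the median is used here only because it is less sensitive to a single long-running problem than the mean.

```python
from datetime import datetime
from statistics import median


def cycle_time_days(first_reported, fix_implemented):
    """Days from the first incident report to the permanent fix."""
    return (fix_implemented - first_reported).total_seconds() / 86_400


# Hypothetical (first_reported, fix_implemented) pairs.
records = [
    (datetime(2024, 1, 1), datetime(2024, 1, 11)),
    (datetime(2024, 1, 5), datetime(2024, 2, 4)),
    (datetime(2024, 2, 1), datetime(2024, 2, 21)),
]

times = [cycle_time_days(reported, fixed) for reported, fixed in records]
median_days = median(times)
print(f"Median cycle time: {median_days:.1f} days")
```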

Tools used by Problem Manager

Problem management is a data-intensive function. Technology can improve the effectiveness and efficiency of problem-management processes and help problem managers direct organizational resources to achieve maximum benefit. Some of the key tools problem managers use include:

Data analysis and reporting tools

There is a huge variety of data-analysis and reporting tools to help the problem manager assess the problem and incident queues, understand the impact on business operations and justify investments in fix recommendations. Many of these tools are already in use throughout the organization. Problem managers should seek to leverage whatever data is available.

Statistical analysis tools

Statistical methods will be useful to analyze both structured and unstructured information needed in problem management. Problem analysis is a complicated task and there are a huge number of variables that will impact the value of the gathered information.

Incident management system

Your incident-management system, which is likely part of your overall ITSM solution, will provide the most expansive dataset to support problem management. Artificial-Intelligence capabilities available in many systems can provide further assistance to the problem manager to manage large volumes of incident data.

Configuration management database

An accurate CMDB is an essential tool for the problem manager. He or she must refer to the CMDB to understand dependencies and assess the risk of current problems and the potential impacts of any identified fixes. Root-cause-analysis processes that probe cause-and-effect relationships are often heavily dependent on data from the CMDB.

Monitoring

Automatic monitoring software will provide important information for the problem manager. Alerts from these systems can allow him or her to detect problems before they impact the business. This proactive side of problem management is more difficult to implement, but it allows problems to be identified and mitigated before users are affected.

Known-issue database

A problem with an identified cause is now a known error. A comprehensive database of known errors must be maintained, together with the current workarounds that can be applied to return customers to their work when the issues are reported. Known-issue databases are often exposed to end users through self-service portals.
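At its simplest, a known-issue database maps a reported symptom to a root cause and a workaround. The sketch below uses an in-memory dictionary with hypothetical entries and exact-match lookup purely to show the shape of the data; real ITSM tools provide their own storage and search.

```python
# Hypothetical known-error entries, keyed by normalized symptom text.
known_errors = {
    "mailbox sync failure": {
        "root_cause": "expired certificate on the mail gateway",
        "workaround": "Use webmail until the client is patched",
    },
    "connection timeout": {
        "root_cause": "undersized VPN concentrator",
        "workaround": "Reconnect through the secondary gateway",
    },
}


def find_workaround(symptom, db):
    """Return the workaround for a reported symptom, or None if it is not a known error."""
    entry = db.get(symptom.strip().lower())
    return entry["workaround"] if entry else None


print(find_workaround("  Mailbox Sync Failure ", known_errors))
print(find_workaround("screen flicker", known_errors))
```

A symptom that returns no match is handled as a new incident and, if it recurs, becomes a candidate for a new problem record.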

What happens if there is no problem manager in a company?

Someone will always perform problem management. Even in an organization that is not large or mature enough to have a dedicated problem manager, there will always be someone who addresses and manages critical problems on behalf of the organization. Often, this is a service desk manager, business leader or executive. When no one has delegated authority for problem management, however, only the most critical problems will receive attention and the opportunity to avoid incidents will be lost.

When no one is responsible for problem management, only the most obvious and painful problems are addressed and problems that affect the most vocal of your customers are prioritized – the squeaky wheel receives the grease. Problem management operated in this manner will produce haphazard results at best, and, at worst, it will have a significant negative effect on business productivity and customer satisfaction.

The organization will face increased operational risk as it is simply not aware of the risks it is facing daily. When the underlying cause of repeat incidents is not identified and resolved, repeat incidents will clog the incident queue. This causes stress to the service desk staff and reduces productivity. 

The organization will fight fires, but preventative activities are likely to be overlooked. Resolving incidents as they are reported, but without sufficient time to investigate the underlying cause, not only leads to lost opportunities for service improvement but can also lead to staff burnout. Without problem-management and the associated knowledge-management capabilities, service-desk technicians often start from scratch every time an incident is encountered. Addressing underlying causes is the key to service improvement.

Problem managers are critical to the continuous-improvement process, monitoring the overall environment for issues and problems that are impacting business operations, putting the company at risk and incurring unnecessary costs. Whether the organization is large or small, it is essential to have clear accountability for problem management. This ensures a thorough diagnosis of root causes and an efficient allocation of resources to the fixes that will generate the most value.
