Rethinking Root Cause Analysis
Annual Perspective 2016
Introduction
Root cause analysis (RCA) is a systematic process to analyze adverse events and near misses. Initially developed for use in high-risk industries such as nuclear power and aviation, the technique was adapted for health care in the late 1990s during the early days of the patient safety movement. RCA takes a structured, systems-oriented approach to identify what happened (the course of events), why an incident happened (the root cause, or causes), and how to prevent it from recurring in the future (corrective actions). Often, multiple root causes and corrective actions are identified. Accrediting bodies and some state regulatory agencies mandate RCAs after sentinel events and when significant patient harm occurs.
Although the technique of RCA seems conceptually sound, over the past decade experts began to question its effectiveness in health care. In a 2008 commentary, Wu and colleagues enumerated some of the flaws in RCAs: the efficacy of corrective actions identified varies; implementation of the corrective actions is often incomplete; and, at the end of the RCA process (which often involves significant time and resources), most organizations are unable to measure the true impact on future risk reduction. Indeed, one early study that described the RCA process within the Veterans Affairs health system did not assess the impact of corrective actions on outcomes, although it did highlight the value of the technique for organizational learning. Similarly, while the face validity of RCA remains high, other studies have failed to demonstrate that the corrective actions drawn from RCAs have prevented future harm.
Root Cause Analysis: Ongoing Need for Improvement
Notwithstanding these critiques, the manner in which most health care organizations conduct the RCA process today has changed little since the 2008 commentary. RCA remains time and resource intensive, requires organizational commitment to take action beyond the RCA itself in order to be effective, results in multiple corrective actions that vary in strength and success of implementation, and often has an unknown impact on the risk of future harms.
In a widely cited 2016 commentary, Peerally and colleagues described eight problems with RCA as currently applied in health care. The authors explained that current application of RCA in health care inherently conflicts with our deepening understanding of systems thinking, which recognizes the multilayered complexity of virtually all health care processes and seeks to solve latent (hidden or unrecognized) systems issues.
Problems With Root Cause Analysis in Health Care | |
---|---|
1. The name "root cause analysis" | Promotes linear thinking toward a single root cause when most adverse events are complex and have multiple interacting parts and potential intervention points. Tools also tend to focus on a linear or temporal view of an event, rather than a systems view of events. |
2. Investigations of questionable quality | Teams may not have in-depth training in accident investigation and human factors; clinicians may be unwilling or unable to provide relevant data to the investigation; patients and families often are not included in the investigative process. |
3. "Political hijack" | Investigations are usually conducted under significant time pressure, lack independence from the organization where the events occurred, and are subject to hindsight bias. These factors may significantly limit the scope of the investigation, the recommendations, and the report. |
4. Poor design and implementation of risk controls | Tendency to focus on weak solutions such as reminders, education, and training over stronger solutions addressing flawed technology or flawed systems design; lack of evidence for effectiveness of solutions; insufficient follow-up on implementation of solutions. |
5. Feedback does not support learning | Learning is supported when outcomes of investigation are shared and the recommended solutions are "salient and actionable." This type of feedback is often missing in health care settings. |
6. Focus on single incidents and institutions | Incidents tend to be investigated in isolation—single events within single institutions. This limits opportunity for understanding recurrence vulnerability and may lead to focusing resources on preventing very rare events over solving broader systems issues that have greater potential to prevent harm. Systematic ways of aggregating root causes are lacking. |
7. Confusion about blame | The balance between individual and organizational responsibility is complex. While most accidents are the result of systems defects, serious individual transgressions also occur and need to be addressed. Taking an overly algorithmic approach to just culture can obscure complicated relationships between individual action and organizational defects. |
8. Problem of many hands | Many actors, both within and outside the health care organization, may be implicated. RCA-derived risk controls tend to focus on solutions that are within the internal control of the organization or team and do not assign responsibility to broader problems outside of their control (e.g., equipment design or medication labeling). |
Focusing more specifically on problems with RCA tools and techniques, Card described "the problem with '5 whys.'" He argued that the approach of asking "why" multiple times (a technique drawn from the Toyota Production System) is oversimplified and may inappropriately lead to the identification of a single root cause, possibly at the very end of the error chain. When this happens, the RCA team may miss more upstream causes and their associated corrective actions or solutions. He suggested that while the "5 whys" may prove useful in an educational setting, the technique should not be used for health care RCAs.
Dekker has also been critical of the safety movement's reliance on procedural solutions. He highlighted how hindsight bias can subvert efforts to understand causality in serious accidents. Hindsight bias occurs when people investigating a situation already know the outcome. It tends to cause humans to simplify complex accident trajectories, identify decisions that may have been ambiguous (at best) in real time as clearly wrong, and attribute blame when the actions of individuals are not blameworthy.
Improving the RCA Process
In response to these problems with the RCA process, the National Patient Safety Foundation (NPSF) released a report in 2015 entitled RCA2: Improving Root Cause Analyses and Actions to Prevent Harm. To begin, NPSF advised renaming the process "Root Cause Analysis and Action," hence RCA2 ("RCA squared," for the two A's). This name emphasizes that the outcomes of the RCA (the corrective actions and resultant risk mitigation) are just as important as, if not more important than, the process itself. In addition, NPSF outlined nine formal recommendations.
RCA2: Improving Root Cause Analyses and Actions to Prevent Harm: 9 Recommendations From NPSF | |
---|---|
1. Leadership | Leadership (e.g., CEO, board of directors) should be actively involved in the root cause analysis and action (RCA2) process. This should be accomplished by supporting the process, approving and periodically reviewing the status of actions, understanding what a thorough RCA2 report should include, and acting when reviews do not meet minimum requirements. |
2. Reevaluate the process regularly | Leadership should review the RCA2 process at least annually for effectiveness. |
3. Determine which events should not go through RCA | Blameworthy events that are not appropriate for RCA2 review should be defined. |
4. Prioritize which events to review | Facilities should use a transparent, formal, and explicit risk-based prioritization system to identify adverse events, close calls, and system vulnerabilities requiring RCA2 review. |
5. Expedite the RCA process | An RCA2 review should be started within 72 hours of recognizing that a review is needed. |
6. Train a designated team | RCA2 teams should be composed of four to six people. The team should include process experts as well as other individuals drawn from all levels of the organization, and inclusion of a patient representative unrelated to the event should be considered. Teams should not include individuals who were involved in the event or close call being reviewed, but those individuals should be interviewed for information. |
7. Provide time to conduct the investigation | Time should be provided during the normal work shift for staff to serve on an RCA2 team, including attending meetings, researching, and conducting interviews. |
8. Use appropriate tools and identify viable corrective actions | RCA2 tools (e.g., interviewing techniques, Flow Diagramming, Cause and Effect Diagramming, Five Rules of Causation, Action Hierarchy, Process/Outcome Measures) should be used by teams to assist in the investigation process and the identification of strong and intermediate strength corrective actions. |
9. Provide feedback | Feedback should be provided to staff involved in the event as well as to patients and/or their family members regarding the findings of the RCA2 process. |
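The quantitative parts of recommendations 5 and 6 (start within 72 hours, four to six team members, no team members who were involved in the event) lend themselves to a simple automated check. The following is a minimal sketch only: the `ReviewPlan` record, its field names, and the `check_plan` function are hypothetical illustrations, not part of the NPSF report, and the judgment-based recommendations are not modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds taken from the recommendations table above.
MAX_START_DELAY = timedelta(hours=72)  # recommendation 5
TEAM_SIZE_RANGE = range(4, 7)          # recommendation 6: four to six people


@dataclass
class ReviewPlan:
    """Hypothetical record of a planned RCA2 review (not an NPSF data model)."""
    need_recognized_at: datetime
    review_started_at: datetime
    team_members: set
    people_involved_in_event: set


def check_plan(plan: ReviewPlan) -> list:
    """Return human-readable deviations; an empty list means none were found."""
    problems = []
    if plan.review_started_at - plan.need_recognized_at > MAX_START_DELAY:
        problems.append("review started more than 72 hours after the need was recognized")
    if len(plan.team_members) not in TEAM_SIZE_RANGE:
        problems.append("team should have four to six members")
    overlap = plan.team_members & plan.people_involved_in_event
    if overlap:
        problems.append(
            "team includes people involved in the event: " + ", ".join(sorted(overlap))
        )
    return problems


# Example: a plan that deviates from all three checks (illustrative data only).
plan = ReviewPlan(
    need_recognized_at=datetime(2016, 3, 1, 8, 0),
    review_started_at=datetime(2016, 3, 5, 8, 0),
    team_members={"nurse_a", "pharmacist_b", "engineer_c"},
    people_involved_in_event={"nurse_a"},
)
for problem in check_plan(plan):
    print(problem)
```

A check like this covers only the mechanical criteria. Whether the team includes the right process experts, or whether a patient representative should be added, remains a matter for leadership judgment.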
Similar to the NPSF recommendations, Peerally and colleagues advocated for moving RCA from a "procedural ritual" to a process that has a meaningful effect on outcomes. They suggested that incident investigators should be trained in human factors, analytical methods, and relevant theories; patients and families should be meaningfully engaged in the process; institutions should develop clarity around issues of blame; and opportunities for aggregated analysis should be created at multiple levels (e.g., department, institution, system, state, and national). Lastly and most importantly, they argued that RCA is fundamentally limited by its very nature as a retrospective process. The use of prospective risk identification processes, such as failure mode and effect analysis (FMEA), may be essential for addressing top patient safety risks and may lead to improved risk mitigation. Some argue that, like RCA, FMEA is resource and time intensive and that its impact on safety outcomes in health care is largely unproven. However, a recent study suggested that a simplified approach to FMEA may improve the process.
In addition to RCA, there are other review techniques that can be applied effectively in certain situations, often with less time and effort. One 2016 pilot study tested a concise incident analysis approach, drawn from other abbreviated incident investigation methods such as the Canadian Incident Analysis Framework, the WHO High 5s program, and the Learning from Defects Tool. Those participating in the pilot felt that the tool was useful, concise, and simple, and said that they would continue to use it. An earlier study described the development and adoption of a rapid approach to RCAs, referred to as "SWARMing," derived from swarm intelligence. Such an approach was associated with increased incident reporting as well as a decline in mortality. These promising new approaches merit further testing.
Summary
In the early days of the patient safety movement, RCA seemed like an essential technique to learn from errors and develop robust prevention strategies. The literature published in 2016 highlights ongoing problems with RCA in health care, most importantly emphasizing the difficulty of measuring the impact of the process on reducing future risk. Several studies have suggested opportunities for improvement, including redesigning the process and tools used, ensuring the active involvement of organizational leadership, focusing on stronger corrective actions, measuring implementation and impact on outcomes, and considering the use of abbreviated incident review approaches when appropriate. The push by an influential safety organization (NPSF) to rename the RCA process is evidence of the growing imperative to reinvent this important safety process based on accumulated experience and data.