Each week I will be providing a Root Cause Analysis "Tip of the Week".
This week, for the inaugural blog post, I'd like to dispel one of the biggest misconceptions in all of root cause analysis.
Here are some of the decades-old definitions of what constitutes a “root cause”:
“A root cause is a factor that caused a nonconformance and should be permanently eliminated through process improvement.” (ASQ)
“A root cause is an underlying or fundamental reason for any failure of safety observance, accident or issues related to health, environment, quality, reliability and production etc.” (Safeopedia)
“The most basic cause (or causes) that can reasonably be identified that management has control to fix and, when fixed, will prevent (or significantly reduce the likelihood of) the problem’s recurrence.” (TapRoot)
“The causal factor(s) that, if corrected, would prevent recurrence of the occurrence.” (DOE-STD-1197-2011 – Occurrence Reporting & Causal Analysis)
What I have found, after a few decades of leading or facilitating root cause analyses in the commercial nuclear industry and the National Laboratories, is that these definitions are one of the main reasons organizations have trouble preventing the recurrence of significant issues. Other reasons include the lack of knowledge and experience of the personnel conducting the analyses, and the organization's unwillingness to actually find the root causes (rather than settling for the low-hanging fruit so they can move on). These definitions, taken literally, ask only for the root causes of the event being investigated. So as soon as we identify the root causes of that event, we are confident that our corrective actions will prevent it from recurring. And all that is entirely true.
However (and this is a big however), true root causes are actually deeper than that. The deepest-seated root causes will have caused not only the one event we are investigating, but also many other events in the past, present and future.
Let's take the following Industrial Safety example, since safety is a universal language we all speak in any industry. A person fell off a ladder and broke their arm at a highly regulated facility. The ensuing root cause analysis found that they were not wearing fall protection and had inappropriately stepped on the top rung of the ladder. Many organizations would declare victory and make sure that no one else would ever step on the top rung of a ladder and get injured because they were not wearing fall protection above 6 feet.
A deeper investigation would find that the activity was managed as "skill of the craft" and did not require a work order, a hazards analysis, or a pre-job briefing to discuss working above 6 feet or proper ladder usage. The real root cause? This organization did not have a graded approach to its Work Control process. That meant that every work activity was controlled either by a full suite of requirements or by none at all. So as a workaround, supervisors would take the path of least resistance for seemingly low-risk activities and assign the work as skill of the craft to avoid the extra paperwork. It turns out that there were over 100 such work activities in the field, involving other hazards besides working at heights. All those other activities (e.g., working under heavy loads, working with rotating equipment, working in confined spaces, working in areas with high combustible loadings) were just as susceptible to accidents and were also affected by the lack of proper work controls. Addressing ladder usage and fall protection requirements would have only a limited effect on all those other activities, which would remain at risk and could result in future events. Bottom line: personnel remained at risk until the Work Control Program was revised to establish a minimum level of control that included an evaluation of potential safety hazards and a pre-job briefing to discuss expected conditions and what personnel needed to work safely.
By all accounts, and by the accepted definition of "root cause," the original analysis did identify and address the root causes of the original event. However, it fell well short of getting to the fundamental issues that would continue to plague the facility for years to come.
Here's a better definition of "root cause":
Deep-seated causes for an event or condition, which, if corrected or eliminated, would preclude repetition of, not only the event or condition being analyzed, but also many other conditions affecting past, present and future performance.
Tip of the Week: Check the definition in your Corrective Action or Root Cause Programs and update it to this definition if needed; it will prompt a deeper investigation the next time your organization has an event requiring a root cause analysis.
If you would like additional information on dramatically improving the effectiveness of your root cause analysis, visit us at: https://dle-services.com/bluedragon