Thursday, April 26, 2018

One Root Cause

Post by CatalinaNJB

An analysis of an occurrence can identify only one root cause. The objective of a root cause analysis is to identify the time and location at which a different direction would have produced a different outcome. At that intersection of time and location, the fork in the road, a risk assessment is applied to the decision, either as a checklist risk assessment tool or as a crew-experience risk assessment tool. The two tools could produce two different outcomes. The fork in the road is therefore not only the point at which a decision must be made about what action to take, but also the point at which to decide which risk assessment tool to apply.

The root cause is what supports a system in operations
What if there could be more than one root cause? If several root causes were identified, operators would have to develop a corrective action plan (CAP) for each root cause and implement all of those CAPs.

With several root causes, each corrective action plan would carry the same weight as every other. If the CAPs are instead assigned different weights, the one with the most weight becomes the primary root cause and the others become secondary root causes. In that scenario we are back to one root cause.
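The weighting argument can be sketched in a few lines of Python. This is only an illustration: the function name, the cause labels and the weights are hypothetical and do not come from any SMS manual.

```python
def primary_root_cause(cap_weights):
    """Return the cause whose CAP carries the most weight.

    cap_weights maps a candidate root cause to the weight of its CAP.
    Returns None when no single cause stands out (empty input or a tie).
    """
    if not cap_weights:
        return None
    top = max(cap_weights.values())
    leaders = [cause for cause, weight in cap_weights.items() if weight == top]
    return leaders[0] if len(leaders) == 1 else None


# Hypothetical weights: one CAP outweighs the others.
weighted = {"flight scheduling": 2, "system comprehension": 6, "equipment": 2}
print(primary_root_cause(weighted))

# Equal weights: no cause can be singled out.
equal = {"flight scheduling": 1, "system comprehension": 1, "equipment": 1}
print(primary_root_cause(equal))
```

As soon as the weights differ, one cause comes out on top and the rest are secondary, which is the point made above.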

"So, what is actually a root cause?"

The root cause of a nonconformity or an undesirable event is the course of action applied at a decision point within a process, at a “fork in the road”. In other words, it is the point in time or place where a decision has to be made (and there always will be one). That decision is based on data, information, knowledge and comprehension of how interacting systems may shape the future. A root cause is not just something to consider when things go wrong; it is also something to consider when things are going right. There is no reason to wait for something to go wrong before assigning a root cause. In an SMS world, root causes are identified proactively. The question then becomes how to identify a root cause before an event has occurred. It is just as simple as identifying the root cause after an event: the root cause is based on the safety risk level in the risk assessment. A risk assessment should be conducted before changes are made or new processes are implemented. In this risk assessment the risk is either accepted with the known hazards or accepted with mitigation of those hazards. A simple example of mitigation is de-icing an airplane prior to departure when icing conditions exist.
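The two ways of accepting a risk can be illustrated with a minimal Python sketch. The 1 to 5 likelihood and severity scales, the multiplication used as a risk level, and the threshold value are all assumptions made for this example; an operator's own risk matrix would replace them.

```python
ACCEPTABLE_RISK = 6  # hypothetical threshold on a 1-25 scale


def risk_level(likelihood, severity):
    """Risk level as likelihood times severity (assumed 1-5 scales)."""
    return likelihood * severity


def assess(likelihood, severity, mitigated_likelihood=None):
    """Decide whether a risk is accepted as-is, accepted with mitigation, or not accepted."""
    if risk_level(likelihood, severity) <= ACCEPTABLE_RISK:
        return "accept with known hazards"
    if (mitigated_likelihood is not None
            and risk_level(mitigated_likelihood, severity) <= ACCEPTABLE_RISK):
        return "accept with mitigation"
    return "do not accept"


# Departure in icing conditions, with illustrative numbers only.
print(assess(likelihood=4, severity=5))                          # no de-icing
print(assess(likelihood=4, severity=5, mitigated_likelihood=1))  # de-iced first
```

With these illustrative numbers, the departure is acceptable only when the de-icing mitigation is applied.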

A risk analysis sets the bar and draws the line in the sand
Going a step further with the de-icing scenario, the root cause is already known if the aircraft departs in icing conditions and crash-lands in the trees. The root cause is not that the airplane was not de-iced, but that the de-icing system has broken down. This is known from the fact that flight crews operating in icing conditions have knowledge of the effect of ice on critical surfaces, yet elected not to apply the mitigation of de-icing.

The flight crew did not comprehend the de-icing system. There was no comprehension of how systems interact with one another. (If there had been, it is assumed that the flight crew would not have departed in icing conditions without mitigation.) A de-icing system includes several systems: environmental factors, human factors, organizational factors, supervision factors, technical factors and performance factors. The process flow of system comprehension runs from data (collected by hazard reports), to information (data turned into information), to knowledge (absorbed information), to comprehension (of interacting systems). When comprehension is missing, the system is faulty; and when data is not analyzed, system comprehension is faulty. This faulty system comprehension does not rest with the flight crew but with the operator.

It is the organization that conducts the risk assessment and decides what mitigation to apply for operations of the de-icing system. This risk assessment includes environmental factors. Is the de-icing equipment suitable for that type of aircraft? A spray bottle is not suitable for de-icing a large passenger jet. Human factors are assessed for the impact that flight scheduling might have on the de-icing system. Organizational factors may have overlooked holdover times or the availability of weather information. Supervision factors may have expected the flight crew to “remember” to de-ice with another million tasks to take care of on this flight. There might be other technical factors causing a de-icing unit to malfunction, or the operator may have accepted standard performance data for takeoff. All these factors affect the total de-icing system, which is the system that makes the airplane fly after takeoff.

With all this information, the root cause becomes predictive in scope: there is a high probability, above an acceptable risk level, of an aircraft incident while operating in icing conditions if the operator has overlooked or disregarded one or more of these factors. The root cause to address in order to prevent an aircraft crash in icing conditions is an organizational system that had a catastrophic failure. This holds because an operator under these conditions did not comprehend the complete de-icing system, which makes it only one root cause.
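The predictive reasoning above can be sketched as a simple coverage check over the six factor groups named in this post. The function names and the wording of the returned finding are hypothetical, chosen only to illustrate the idea.

```python
# The six factor groups of the de-icing system named in the post.
FACTORS = (
    "environmental",
    "human",
    "organizational",
    "supervision",
    "technical",
    "performance",
)


def overlooked_factors(assessed):
    """Return the factor groups that the operator's risk assessment did not cover."""
    return [factor for factor in FACTORS if factor not in assessed]


def predicted_root_cause(assessed):
    """Return a single predicted root cause if any factor was overlooked, else None."""
    missing = overlooked_factors(assessed)
    if missing:
        return "organizational system failure (overlooked: " + ", ".join(missing) + ")"
    return None


# An operator that assessed only four of the six factor groups.
print(predicted_root_cause({"environmental", "human", "technical", "performance"}))
```

However many factors are missing, the finding is reported as one organizational root cause, in line with the argument of this post.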


Sunday, April 8, 2018

The Concept Of SMS

Post by CatalinaNJB

The first step when giving Safety Management System (SMS) training to personnel is to provide training in the concept of SMS. It is crucial to the success of a newly implemented SMS that the concept of SMS is introduced at all levels of the organization before any other SMS training takes place. This includes management and the CEO in addition to all other personnel. The CEO, President or Accountable Executive (AE) is just as new to SMS as the airport manager, pilots, mechanics and other personnel in the organization. That a person ranks higher in an organizational hierarchy does not imply that the person has greater knowledge of SMS or a comprehension of the SMS. It could be an intimidating task for a brand-new SMS Manager to approach the CEO and suggest SMS training. It could also be assumed that since the AE has final authority over SMS financial and human resources, the AE is already an SMS expert. However, when the AE accepted the AE position and appointed an SMS Manager, the SMS Manager became the only SMS expert within that organization, with accountability to the SMS program itself to train the AE from the ground up.

Compliance with regulations, standards and policies is not a safety guarantee    
There are two primary concepts to the SMS. The first concept of an SMS is for an abstract idea to produce an abstract result. This is applied when developing SMS manuals and processes for regulatory compliance. Regulatory compliance is a static state of the operations and therefore abstract in nature.

The second concept of SMS is for an abstract idea to produce a tangible result. This combination is in direct conflict with normal operations, where an abstract input, that is, something existing in thought or as an idea but without a physical or concrete existence, is expected to produce only an abstract output. [E.g., brainstorming sessions are not expected to produce real results but are expected to produce safety plans for implementation.] An enterprise could easily fall into a trap where the only purpose of its SMS becomes to produce a paper trail, or the collection of data, as evidence of safe operations. The Regulator’s design validation and demonstration validation of an SMS make it evident that there are distinct differences between abstract compliance and tangible compliance.

Since hazards are opinion-based, hazards become the abstract concept of SMS, while the task of initiating hazard reporting becomes the tangible output. Incidents and accidents cannot be reported until after the fact and are not staged for the purpose of reporting. Real hazards, on the other hand, may be staged for the purpose of reporting within a supervised environment. During a training session, multiple hazards could be introduced to the group for hazard reporting. Each person may place a different weight on the same hazard based on learning, expectation, experience and opinions, which are therefore abstract conditions. One person may report all conditions as hazards, while others may report none or just a few. Within an effective SMS, opinions of hazards are transformed into an output of hazard reports. Continuous safety improvements and proactive safety measures are dependent on data received through hazard reporting. When the concept of SMS is defined beyond hazard reporting, SMS becomes reactive in concept.

In the eye of the beholder a sunset is a hazard.
Incidents and accidents cannot be predicted, but the intent of hazard reporting is to prevent accidents. Some years ago, the root cause of a 747 airliner accident was an event that had occurred twenty-two years earlier. During those years the airplane had been in and out of the shop many times, but nobody reported what they had observed. Hazard reporting was not an accepted practice, and reporting a hazard caused by a faulty repair, and thereby questioning workmanship, could be seen as disrespectful to the mechanics. Over several years a discoloring on the airframe was observed, but no toolbox was available for personnel to report this concern. A hazard report of these observations would have required at least one person to review the hazard and sign off on a safety risk analysis. Investigation of a hazard report may have guided personnel to inspect a damaged bulkhead.

The old saying that “selling is not telling” also applies to the concept of SMS: “teaching is not telling”. Training in the concept introduces SMS as a toolbox full of tools to use for continuous safety improvements. The tools available to all personnel are sorted by their roles, responsibilities and accountabilities. The role, responsibility and accountability of the AE is to ensure that each person feels included in the SMS processes, takes ownership of their hazard report and follows their own report from beginning to end. The end result of a hazard report may uncover options that could never have been imagined.

