Friday, May 29, 2020
Failure To Comply
When an airplane crashed in the 1950s, the probable cause was determined to be the flight crew’s preoccupation with matters unrelated to cockpit duties. About sixty years later, the probable cause in another airplane crash was the pilot’s failure to maintain airspeed and correct pitch attitude. Over a period of 60 years the probable cause was turned upside down, or 180 degrees, and had gone from tasks that were performed to tasks that were not performed. In the 50’s they got the correct probable cause, while today the cause is assigned to an event that didn’t take place. The difference is that it is impossible to fail to comply with a task, since one task or another is always performed. In addition, it becomes impossible to develop a corrective action plan for a task that did not occur.
Airplane crashes happen because of human behavior and not because of human error, or failure to comply with an arbitrarily defined task. There is a reason for the Safety Management System to consider human factors, organizational factors, supervision factors and environmental factors in a root cause analysis. There is a reason for airports to train their airside personnel in human and organizational factors prior to being assigned airside tasks. There is a reason that some airports apply crew resource management training to airside personnel, just as an airline applies these principles to its flight crew. The reason is that it is human nature to take the path of least resistance, which includes preoccupation with trivial tasks.
The Safety Management System is a wonderful process control tool applied to operations. When processes are applied to a certificate it is operational control. When applied to job performance it is monitoring and oversight. A flight crew certificate is managed by operational control, while airport personnel are managed by performance oversight. There is an option for operational control of an airport certificate by implementing the airport zoning regulations.
It is impossible to run an effective Safety Management System without a Statistical Process Control (SPC) analysis. Without process control any corrective action plans are short-term and the issue will repeat itself over and over again. A downward trend in a bar chart or a pie chart is not a guarantee of an in-control process. These are trending charts displaying an upward or downward trend, or the size of each piece in the pie chart. These trends are often labeled a “good trend” or a “bad trend”, which are emotional definitions and not derived from data. Trends are not good or bad, they are just trends. Simplified, an upward trend of incidents may be a “bad trend”, while it could also be a “good trend” if compared to a baseline. If an airline quadrupled its fleet and cycles while the incidents doubled, the incident rate per cycle was cut in half, and it is a “good trend”. The question is not if operations are in a “good” or “bad” state, but if processes are operationally acceptable. Whether for an airport or an airline, an SPC control chart paints a quality assurance picture of your operational processes. The question then becomes how to make incremental improvements to lower the upper control limit.
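As a rough illustration of what an upper control limit is, here is a minimal sketch of an individuals (XmR) control chart calculation. The chart type behind the charts in this post is not stated, so the XmR method, the function names `xmr_limits` and `out_of_control`, and the yearly counts are all assumptions made for illustration. The sketch only checks the basic rule of a point falling outside the limits, not the additional run rules an SPC analysis may apply.

```python
from statistics import mean


def xmr_limits(counts):
    """Return (center, lower, upper) limits for an individuals (XmR) chart."""
    if len(counts) < 2:
        raise ValueError("need at least two observations")
    center = mean(counts)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(counts, counts[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is the standard XmR constant (3 / 1.128).
    upper = center + 2.66 * mr_bar
    lower = max(0.0, center - 2.66 * mr_bar)  # a count cannot be negative
    return center, lower, upper


def out_of_control(counts):
    """Return the indices of observations outside the control limits."""
    _, lower, upper = xmr_limits(counts)
    return [i for i, c in enumerate(counts) if c < lower or c > upper]


# Hypothetical yearly accident counts, invented for illustration only.
yearly_accidents = [3, 5, 4, 6, 4, 5, 3, 4, 6, 5]
center, lower, upper = xmr_limits(yearly_accidents)
print(f"center={center:.2f} LCL={lower:.2f} UCL={upper:.2f}")
print("out-of-control points:", out_of_control(yearly_accidents))
```

In this sketch the upper control limit is the number the process "accepts": as long as yearly counts stay below it, the process is statistically doing what it was designed to do, whether or not that number is acceptable to the operator.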
Airlines are promoting themselves as the safest mode of transportation. If this is the case, why does the Global Aviation Industry, being Airlines or Airports, need a Safety Management System (SMS) today, when they were safe yesterday without an SMS? Air travel per flight may have become a safer mode of transportation today than decades ago, but the question to ask is if their operational processes are in-control with a lower upper control limit today than they were decades ago.
SMS processes are analyzed with respect to human factors, organizational factors, supervision factors and environmental factors. A root cause analysis places a weight-factor on all of these factors and the highest factor is the root cause. When investigating an accident for a probable cause one of these factors will stand out as the root cause.
In the examples below an aircraft type was randomly selected for SPC analysis of accidents between 1985 and 2019. This analysis included all global reported accidents and operators. The next step was to select one operator and analyse the processes for that specific operator. The result shows that airline travel processes are not safer today than they were in 1985.
The first control chart shows operational control that is out-of-control. The upper control limit is 19.5, which means that operators of this type of aircraft accept a process with 19.5 accidents per year.
The control chart below of one selected operator shows an in-control-process. This operator accepts a process with 4.3 accidents per year.
Let’s assume for a moment that aviation safety has improved over the years and that the time frame between 1985 and 2019 is too wide. The next step is to analyze the same scenarios, but between 2010 and 2019.
The control chart below shows that operational processes have become less safe than in prior years.
When narrowing the time frame to 10 years, the processes produced a result with 22.7 acceptable accidents per year.
In addition, since this is an in-control process, as opposed to the out-of-control process between 1985 and 2019, this process systematically accepts 22.7 accidents per year.
For the operator, their operational processes produced an improved result, with acceptable processes of 3.4 accidents per year, which is down from 4.3.
When analyzing quality assurance of processes, this specific scenario produced a result that aviation processes are not safer today than they were in 1985. That aviation is the safest mode of transportation could be an illusion. The beauty of a Safety Management System is that it will capture both processes that are out-of-control and in-control processes that accept an unacceptable level of accidents. With an SPC process analysis, incremental improvements can be made to human factors, organizational factors, supervision factors and environmental factors. It appears that the aviation industry in the 50’s, identifying the flight crew’s preoccupation with matters unrelated to cockpit duties as a probable cause, had a better grip on safety processes than the industry does today.
Monday, May 18, 2020
The Red Car
Hazard identification is the foundation of a healthy Safety Management System. Events and occurrences are the consequences of hazards and are simple to identify. In the old days of aviation safety, incidents and accidents were defined as pilot error. Without any further analysis, pilot error became the standard answer to past occurrences. After a major occurrence new regulations were implemented, technical standards were changed, and new equipment was installed. Still, after decades of new and improved changes, accidents happened. In an attempt to overshadow the inherent hazards of flying, accidents were defined as meaningless and safety was defined as common sense. Hazards were trivialized and flying was promoted as the safest mode of transportation. After thousands of hours of accident investigations, hazards were brushed aside as an insignificant element of safety, since safety was common sense and accidents meaningless.
It’s not always the change, but the process change itself that is opposed.
With the implementation of the Safety Management System (SMS), the aviation industry was required to actively identify hazards, implement a hazard registry and analyze hazards affecting their operations. This approach was new to the industry and was rejected with the explanation that hazard identification was a part of the daily tasks of pilots and airport personnel, and that hazards were their duty to avoid. In addition, the lack of hazard reports was taken as a sign of complete safety in an operational environment without any hazards. In their own mind they had become as safe as possible without the need for improvements.
One day, when you bought a new red car, you noticed how many other red cars were on the road. Not only were the other cars the same color as yours, but they were also the same make and model. It was not until you became aware of your own make and model that you noticed this. How often have you driven down the same road for several years, and then one day noticed a new house? In your own mind, the house was brand new. However, after a short review, you realized that it had always been there, except you had not noticed it before.
Identifying all hazards is as distracting as using a smartphone while taxiing.
Hazard identification operates on the same principle. Unless hazards are actively identified, they will not be observed. A hazard is not only the airside vehicle that out of nowhere runs across the taxiway in front of you, but it is also the vehicle that waits for you to taxi or enters the taxiway behind you. Hazards are everywhere, but when the same hazard is observed regularly the tendency is to eliminate it as a hazard since it has become a part of normal operations. Some operators, being airlines or airports, may demand that all hazards are reported. However, the answer to hazard management is not as simple as to report everything. Reporting all hazards could in itself distract a pilot’s attention from priority tasks and be a contributing cause of an incident. For the airside vehicle operator, identifying all hazards could be a contributing factor for a runway incursion. Hazard management is hard work and extremely complex.
Hazards are an inherent risk of aviation for both airlines and airport operators. That a hazard repeats itself regularly and often does not eliminate it as a hazard, but it becomes a common cause variation of hazard management. For an airline, the constant airside vehicle operations are a hazard to their operations. On the other hand, for an airport operator, the constant taxiing of airplanes is a hazard to their operations. At some point in time, these two hazards are literally on a collision course. That the primary purpose of an airport is aircraft operations does not give an airline the hazard priority. Both airplanes and vehicles are of the same hazard priority, while they are operating under different rules. That an airplane has the right-of-way, while a vehicle is required to yield, does not imply that the vehicle is the sole hazard.
Both airlines and airports have access to a statistical process control tool to identify the effectiveness of their hazard reporting culture, or if their hazard reporting system is in control.
In the control chart below the hazard reporting culture is in-control. Statistically, the result conforms to the process, adjusting the upper and lower control limits.
As shown in the chart below, if there were 10 times more hazards reported, the process would still be an in-control reporting culture.
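The point that ten times more reports can still be an in-control process can be sketched with a small calculation. This assumes an individuals (XmR) chart, where the limits are recomputed from the data itself; the chart type used here is not stated, and the monthly report counts and the names `xmr_limits` and `is_in_control` are hypothetical.

```python
from statistics import mean


def xmr_limits(counts):
    """Return (lower, upper) control limits for an individuals (XmR) chart."""
    center = mean(counts)
    mr_bar = mean([abs(b - a) for a, b in zip(counts, counts[1:])])
    return max(0.0, center - 2.66 * mr_bar), center + 2.66 * mr_bar


def is_in_control(counts):
    """True when every observation falls within its own control limits."""
    lower, upper = xmr_limits(counts)
    return all(lower <= c <= upper for c in counts)


# Hypothetical monthly hazard report counts, invented for illustration only.
monthly_reports = [4, 6, 5, 7, 5, 6, 4, 5, 7, 6, 5, 6]
ten_times = [10 * c for c in monthly_reports]

# The limits are derived from the data, so they scale with the reporting level.
print(is_in_control(monthly_reports))  # True
print(is_in_control(ten_times))        # True

# A sudden jump in the last month, e.g. new personnel reporting with fresh eyes.
with_new_hires = monthly_reports[:-1] + [25]
print(is_in_control(with_new_hires))   # False
```

Because the limits follow the data, the chart says nothing about whether the reporting level is the right one. It only shows whether reporting behavior is stable, and a sudden change in who is reporting shows up as a point outside the limits.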
Several observations can be extracted from these control charts, but one fundamental observation is that the process conforms to its own environment to maintain an in-control process, i.e. human behavior conforms to expectations of reporting all or reporting none. In an organization where zero hazards are reported, human behavior conforms to that expectation. With a hiring spree and several personnel beginning at the same time, the same process may show an out-of-control process, since these new eyes are observing hazards without bias, or without prior exposure to the hazards.
Without the comprehension of both airline and airport operations, this control chart may cause incorrect mitigation of hazards. Since there are inherent risks involved in aviation, it is the special cause variations of hazards that must be reported. That there are new personnel involved is not an indication of additional hazards, but an indication that operational management did not have in place a hazard reporting system.
A hazard reporting system is when there are defined parameters of what is expected to be reported. For an airline this could be for the flight crew to report wildlife hazards while taxiing straight on Taxiway A, or for the airport operator the task for airside personnel could be to report wildlife hazards on approach to RWY 27 during a specified time. When the parameters are established it becomes possible to capture special cause variations, populate the hazard registry, conduct a root-cause analysis and implement a corrective action plan. Operational hazard management must live by the principle of “The Red Car” and define hazard parameters.
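One way to picture defined reporting parameters is as a small filter that decides which observations go into the hazard registry. The following is a minimal sketch only; the class names, field names, locations as strings and the time windows are hypothetical and are not taken from any regulation or real reporting system.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ReportingParameter:
    """A defined expectation of what to report, where and when (hypothetical)."""
    hazard_type: str
    location: str
    start: time
    end: time


@dataclass(frozen=True)
class Observation:
    hazard_type: str
    location: str
    observed_at: time


def reportable(observations, parameters):
    """Keep only the observations that match a defined reporting parameter."""
    return [
        o for o in observations
        if any(
            o.hazard_type == p.hazard_type
            and o.location == p.location
            and p.start <= o.observed_at <= p.end
            for p in parameters
        )
    ]


# Assumed parameters: wildlife on Taxiway A all day, and wildlife on
# approach to RWY 27 during an invented 06:00-09:00 window.
parameters = [
    ReportingParameter("wildlife", "Taxiway A", time(0, 0), time(23, 59)),
    ReportingParameter("wildlife", "RWY 27 approach", time(6, 0), time(9, 0)),
]
observations = [
    Observation("wildlife", "Taxiway A", time(14, 10)),
    Observation("wildlife", "RWY 27 approach", time(7, 30)),
    Observation("wildlife", "RWY 27 approach", time(12, 0)),
    Observation("vehicle", "Taxiway A", time(8, 0)),
]
to_report = reportable(observations, parameters)
print(len(to_report), "of", len(observations), "observations are reportable")
```

The design choice is that personnel are asked to look for one red car at a time: observations outside the parameters are not ignored as hazards, they are simply not what this reporting task is collecting.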
Monday, May 4, 2020
When the corrective action plan (CAP) is too complex for the regulator to understand, they will dump it, reject it and, without any attempt to analyze it, trash it. A complex CAP is nothing more than a reflection of publicly available guidance material issued by the regulator. This guidance material comes in the form of an Advisory Circular (AC).
Guidance material is communication.
Each ICAO State may have different objectives, but their common goal is to ensure a level of safety in aviation that the flying public will accept. One goal a regulator publishes is to provide a safe and secure transportation system that moves people and goods across the world, without loss of life, injury or damage to property. This is a goal of nice, positive and carefully selected words, but it is also an unattainable goal. In an environment of moving parts, equipment and people, damages are inevitable. A utopia of safety only exists in a regulatory and static environment. When a goal is utopia, safety is status quo where there is no room for incremental safety improvements. Since no process exists by which an operation can ensure zero damages, the regulator must exercise their opinions to enforce subjective compliance. If this subjective compliance is not adhered to, they take certificate actions against an aviation document. In a world where no damages are acceptable, the regulator cannot issue one single operations certificate. In a world where no damages are acceptable, it would be foolish of an operator to implement a new process without the regulator first designing and approving the process with their corporate seal. When a corporate seal is attached, the regulator has a tool to micromanage an operator, without operational responsibility. When the regulator applies an inspector’s opinions as regulatory compliance, their view is backwards looking, where new systems are incompatible and an obstruction to their opinion.
Internal or external audit findings can be at a system level or at a process level. System level findings identify both the system and the specific technical regulation that failed, and process level findings identify the process that was not functioning. To develop an effective CAP, an operator and, more importantly, the regulator must understand the nature of the system or process deficiency which led to the finding. A finding must clearly identify which system or process allowed the non-compliance to occur. Without this clarification a corrective action plan cannot be developed.
A system may be without aim or directions for the untrained eye.
A system level finding is a finding of a process without oversight. Some of the system findings may be related to the safety management system, the quality assurance program, the operational control system or a training program. An operational control system is applicable to an aviation document in flight operations. An airport aviation document is the airport certificate, which is issued to the airport parcel itself. An operational control tool for an airport certificate is the airport zoning regulations.
A process level finding is a finding where at least one component of a system generated an undesired outcome. A process level finding is an operational task of any system, except for the oversight system of the affected process. Without oversight, or a Daily Rundown Quality Control, a process, or how things are done, continues to generate undesirable outcomes.
When a corrective action plan is developed, it is as effective as the operational comprehension level of the person implementing the plan. An Accountable Executive may fully comprehend the CAP, while an inspector of the regulatory oversight body may not. It is normal for an inspector, who is not involved in the daily operations, to be at a level below comprehension of the plan. This is the exact reason why an Advisory Circular so beautifully directed regulatory oversight inspectors to assess only the process used to come up with the CAP and not the CAP itself.
There are four levels to comprehension of a system. The first level is data, the second level is information, the third level is knowledge and the fourth level is comprehension. Data is collected by several means and methods. This data is then formatted and analyzed into sounds, letters or images to provide information, which again is turned into knowledge for a person to absorb. The absorbed knowledge turns into comprehension of one system and of how multiple systems interact. It is unreasonable and unjust to expect that a regulatory oversight inspector comprehends the operational systems of airlines and airports.
A short-term corrective action plan is to immediately design and implement the plan. This immediate plan could be as simple as scheduling training to be completed within 30 days. A long-term corrective action plan is a change of policy or a process change to design a plan to be implemented within a reasonable timeframe. A long-term winter operations CAP might take a year to be implemented, while a short-term CAP could be to clear the snow that day. Without defined timelines the long-term CAP does not exist, no matter how well the plan is written.
Facts give you directions.
A root cause analysis is fundamental to the design of a corrective action plan. The questions to ask when developing a CAP are the 5-W’s and How: what, when, where, why, who and how.
The What question is to establish the facts. The When question is to establish a timeline. The Where question is to establish a location. The Why question is to populate the events as defined in the What question. The Who question is to define a position within the organization as defined in the Where question. The How question is to answer the events as defined in the Why question. When asked correctly, the How question takes you backwards in the process to the Fork In The Road where a different decision would have led down a different path. This does not ensure that an incident would not have happened if this path was taken. All it does is to take a different path than the path that led to an incident.
The 5-Why is a recognized root cause analysis. However, if the Why question is asked incorrectly the root cause statement becomes an incorrect answer. The Why question must be asked how it relates to the How question.
Another element to be analyzed within a root cause analysis is the four causal factors, or factors that affected the root cause statement. Depending on organizational operations and policies, these factors may be expanded to include other and specific operational factors. The four are the Human Factors, Organizational Factors, Supervision Factors and Environmental Factors. When analysed in a root cause analysis each factor is assigned a weight-factor in a matrix of the 5-W’s and How. The factor with the highest weight-factor then becomes the determining and priority factor in the root cause analysis.
When applying this comprehensive approach to the CAP and root cause analysis, it should be expected that the process is too complex for someone who is not involved in daily operations. Supplementary information for the CAP could include a flowchart of how each item in the system affects other items with an expected outcome. This design must be simple and directed to specifics of the Fork In The Road where multiple options are available. When submitting a CAP to the regulatory oversight body, being the regulator or Accountable Executive, it is vital for operational success that the reasoning for the CAP is supported by data.
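The weight-factor matrix can be sketched as a small table with the 5-W’s and How as rows and the four causal factors as columns. The scoring scale (0 to 5 here), the choice to sum the weights per factor, and every number below are assumptions for illustration; how weights are assigned and combined is not specified here and is up to the organization.

```python
QUESTIONS = ["what", "when", "where", "why", "who", "how"]
FACTORS = ["human", "organizational", "supervision", "environmental"]


def priority_factor(matrix):
    """Sum the weights per factor and return (top_factor, totals).

    Summation is an assumed way of combining the weights. On a tie, the
    factor listed first in FACTORS is returned.
    """
    totals = {f: sum(matrix[q][f] for q in QUESTIONS) for f in FACTORS}
    top = max(FACTORS, key=lambda f: totals[f])
    return top, totals


# Hypothetical weights (0 = no influence, 5 = strong influence).
weights = {
    "what":  {"human": 3, "organizational": 2, "supervision": 1, "environmental": 1},
    "when":  {"human": 2, "organizational": 3, "supervision": 2, "environmental": 1},
    "where": {"human": 1, "organizational": 4, "supervision": 2, "environmental": 2},
    "why":   {"human": 3, "organizational": 4, "supervision": 3, "environmental": 1},
    "who":   {"human": 2, "organizational": 3, "supervision": 4, "environmental": 0},
    "how":   {"human": 3, "organizational": 4, "supervision": 2, "environmental": 1},
}

top, totals = priority_factor(weights)
print(totals)
print("priority factor:", top)
```

With these invented weights the organizational factor scores highest and would become the priority factor for the corrective action plan. A tie between factors would need a defined rule, which this sketch resolves only by list order.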