Saturday, August 20, 2022

The Practical Applications of SMS

By OffRoadPilots

The Safety Management System (SMS) for aviation is a practical system to lead personnel, manage equipment and validate operational design for improved performance above the safety risk level bar. The bar is always set at an acceptable risk level. This level may vary with the size and complexity of an organization, prior experience and accepted practices, risk analysis and justification, or simply by arbitrary lines drawn in the sand. There are no rules within a safety management system about what reasoning to use to establish the level of the safety risk bar. One reason could be that we always did it this way; another could be to do what is least stressful and creates the least workload.

There are regulatory requirements for a safety management system to include a safety policy, a process for setting goals and for measuring the attainment of those goals, a process for identifying hazards and managing the associated risks, a process for ensuring that personnel are trained and competent to perform their duties, a process for reporting and analyzing hazards, incidents and accidents and for taking corrective actions, a document containing all SMS processes and a process for making personnel aware of their responsibilities with respect to them, a quality assurance program, a process for conducting periodic audits and any additional requirements for the SMS to function as intended. Regulatory requirements are applied to a static operation, where there is no aircraft movement or airside operations. Certificates are issued based on expected performance and maintained based on past performance. A practical application of the SMS comes into play when operations are alive.

If the SMS seems overwhelming, with objectives and goals that are practically unreachable, a simple solution is to move to the Some-Day-Island. On the Some-Day-Island there is no accountability and no responsibility; the island is isolated, there is unlimited safety, it is a perfect place to make excuses and there are no reasons to get up and work with initiatives. Since the Some-Day-Island is located within a personality and not an actual geo-location, there are some traits that identify a person living on this island. They paint a picture of an elephant in the room that is the cause of all their failures. They see all activities to make changes as useless and a waste of time, and believe there is nothing else to learn, since they already know all that there is to know. A person living on the Some-Day-Island expects their organization, be it an airline or an airport, to be in top-notch condition, but still slumbers on the island hoping that someone else, often someone they don’t know, will take the initiative to get it into top-notch condition. The Some-Day-Island is a hazardous place to live, but it is also the place where everything is played safe.


A misconception often held by the Accountable Executive (AE) is that the primary task of an SMS is to complete the checkboxes for regulatory compliance. While it is true that the checkboxes must be completed for tracking and data-point purposes, this task is incidental to the operational tasks preceding the checkbox entries. The primary task is to complete a task that conforms to regulatory requirements. An effective SMS has established regulatory compliance for each individual task within the system. These tasks are monitored, the checkboxes are completed, and by the end of the day the required daily quality control is completed.


The practical application of an SMS is its acceptable practices. An acceptable practice may be a written procedure to adhere to, or an unwritten practice that has taken form over time as a reliable and practical process. Air tanker, waterbomber and other aerial forest fire suppression operations are examples of where unwritten but acceptable practices take over and are applied within a safety management system. When SMS takes control, practical application of the SMS becomes paramount. An aerodrome means any area of land, water (including the frozen surface thereof) or other supporting surface used, designed, prepared, equipped or set apart for use either in whole or in part for the arrival, departure, movement or servicing of aircraft and includes any buildings, installations and equipment situated thereon or associated therewith. In layman’s terms, an aerodrome is anywhere an aircraft, including a drone, operates.

When a waterbomber scoops up water from a lake, that portion of the lake becomes an aerodrome, and the regulations are applicable to the operation. Regulations are not just applicable to the flight crew and air navigation, but also to the aerodrome itself. Aerodrome regulations apply in respect of all aerodromes, which include any surface of land or water where an aircraft operates. The operator of an aerodrome, other than a water aerodrome, shall install red flags or red cones along the boundary of an unserviceable movement area. Because water aerodromes are exempt from this marking requirement, a waterbomber scooping up water may use any portion of a lake to pick up water. On the other hand, a heliport servicing a forest fire area needs to be cleared and delineated.

The airspace above a forest fire is automatically closed and becomes restricted airspace. Only aircraft authorized into the airspace are allowed to enter. Before taking off from, landing at or otherwise operating an aircraft at an aerodrome, the pilot-in-command of the aircraft shall be satisfied that there is no likelihood of collision with another aircraft or a vehicle, and that the aerodrome is suitable for the intended operation. When operating out of a lake, there is little or no data available to assess whether the aerodrome is suitable. Assessing the suitability of the lake becomes an operational, or mental, risk analysis task. Prior to the first pickup, the pilot may circle the lake to collect data for the risk analysis. From above, any reefs and low grounds are easily identifiable, and the first task is to remember their locations in order to align the approach and the runway between the reefs. The second task of the aerodrome risk analysis is to assess both the approach and departure ends for snags, hills, or other obstacles. This analysis may be based on prior experience at that particular lake or on experience from other lakes with similar length and surrounding terrain. The third task of the aerodrome risk analysis is to assess mobile objects, boats, or recreational use of the lake.


With several waterbombers picking up at the same lake, the pilot in command of each aircraft establishes a procedure to pick up in the same area, make the climbing turn in the same direction after takeoff, and arrive on final for the next pickup in the same order as on the previous arrival. When there are operational changes, the first aircraft to change becomes the lead aircraft and the others follow. This is a practical use of the SMS that was established decades before SMS became a regulatory requirement. When looking at SMS as a practical application, the implementation of SMS did not change any operational processes. Any changes were unnecessarily self-imposed by operators.


When operating in an area where lakes acceptable for pickups are few and far between, air tankers are used as the primary tool for forest fire suppression. An air tanker loads up fire retardant at the airport and heads for the fire. These turnarounds can be long and time consuming, but when using a large aircraft, such as the DC-10, one load of fire retardant covers a large area. Continuously flying the DC-10 at low altitude is a variation from what the aircraft was originally designed for. The aircraft was designed for long-haul transportation at high altitudes. During normal operations there are minimal control movements after takeoff and once established on course. Flying the DC-10 as an air tanker is a special cause variation and requires a root cause analysis of the hazard. A hazard does not imply that the operation is unsafe or dangerous, but that the operation is different from what an airline DC-10 captain is trained for. In addition, the continuous strain of low-level turbulence and maneuvering is also different from what the original certification covered. In 2002 two air tankers, of personal interest, crashed during firefighting operations due to material fatigue and the strain this type of operation puts on the aircraft and pilots. When practical application of the SMS is applied, special cause variations are analyzed, assessed, and classified by their safety critical area and safety critical function. A practical application of the SMS does not eliminate accidents, since it is impossible to predict an accident until the last minute. What a practical application of the SMS does is accept that there are inherent risks in aviation, learn from accidents, and apply past experience of how operations went well to build on that knowledge for continuous safety improvements.





Saturday, August 6, 2022

When Safety Gets Involved

By OffRoadPilots

Safety has been involved in aviation since the first flight in 1903, and since then safety results have improved, but randomly and without direction. Airlines did what they could to improve safety but were unsuccessful in totally eliminating accidents. Over time, as aircraft became larger and more of them operated at the airports, airside accidents became systemic errors. When operators become overly focused on safety but do not know what to do with it, safety becomes its own worst enemy. No one wants to expose themselves to danger, but the real hazard when overestimating risks is overcontrolling processes to remain safe.

Overcontrolling safety is a common reaction to opinion-based root cause analyses. When a root cause is based on preliminary assumptions, there is a strong temptation to overcontrol safety to ensure, in one’s own mind, that everything possible was done for immediate safety improvements. After a severe aircraft occurrence everyone wants answers, but impatience and a desire for instant gratification in finding out why an aircraft crashed are a hazard to aviation safety. When the accident investigation process is not understood, management and others in an organization who have been assigned safety oversight may demand solutions right now, without knowing all the facts. In support of their demand for a solution, they may make irrational statements with reference to safety, and place blame and responsibility on lower-level personnel for a lack of safety management. A simple way to protect a high-level position is to play the safety card. The safety card is played when safety becomes the driving force of operations without considering facts. In addition, the safety card is often played when safety is not defined or measured, or when operational pressure is applied by a third party or social media.


There is no reason to expect an immediate finding, cause of accident, or root cause after a single-engine aircraft crash or a large airliner crash. One crash does not render the aviation industry unsafe or demand major changes to operations. If an aircraft crashes due to unreported and unidentified wind shear with an extreme change in wind velocity, or due to contaminated surfaces, there is no justification to cease all aircraft operations, since the aviation industry has already established a track record of being safe. That an investigation is ongoing, and that the cause of the crash and the root cause are still to be determined, does not imply that an airline must remain idle until an investigation report is published. However, an enterprise is compelled by its accountability to the safety management system to conduct an internal analysis of human factors, organizational factors, supervision factors and environmental factors to determine the factor with the highest probability of impact on the events leading up to the crash. An internal analysis, prior to the final accident report, is a probability analysis as opposed to a root cause analysis.


There are multiple phases to an aircraft accident investigation. The most common phases are the field phase, the examination and analysis phase, and the report phase. 


In the field phase, an investigator in charge is appointed and an investigation team is formed. The nature of the occurrence determines the makeup of the investigation team, but it can comprise operations, equipment, maintenance, engineering, scientific, and human performance experts. The number of investigators needed depends on the nature of the event, its severity and the composition of the parties involved. During the field phase the public is informed, the crash site is secured, and pictures or videos are taken of the wreckage and crash site. Overhead drones are a commonly used tool to document facts. Initially, witnesses, airport personnel, company personnel or government personnel are interviewed. Agreeing to be interviewed is voluntary, and information learned from interviews is not used for disciplinary action against pilots, mechanics or other personnel involved. After the initial facts are documented, the wreckage is removed for further examination by the investigative authorities. The regulator does not investigate aircraft accidents.

The examination and analysis phase takes place away from the accident site. This phase consists of examining the company, aircraft, flight crew, training records, maintenance records and safety management system records. SMS is relatively new in the aviation industry, but its records become a vital part of an aircraft accident investigation for analyzing the processes that were applied. Parts and components of the wreckage may be sent to a laboratory for analysis, such as material strength tests, metallurgical analyses and both destructive and non-destructive testing. All possibilities, including seemingly unthinkable options, are analyzed. The examination and analysis phase is an unbiased process without predetermined conclusions. This phase also consists of reading and analyzing recorders and other data, creating simulations, reconstructing events, reviewing autopsy and toxicology reports, conducting further interviews, determining the sequence of events, identifying safety deficiencies, and updating interested parties on the progress of the ongoing investigation. If, at any stage of the investigation, the investigator identifies safety deficiencies, they may inform those who can address the problem right away.

The final phase of an investigation is the report phase, where the investigation report is drafted. Selected members of a committee review the draft report and may approve it, ask for amendments, or return it to the investigators for further work. A report may be rejected for any reason but may also be approved for any reason. Once there is a consensus on the draft report, it is sent to designated reviewers on a confidential basis for comment. A designated reviewer may be any person at an air carrier, airport, corporation, manufacturer, or association who, in the opinion of the review committee, will contribute to the completeness and accuracy of the report. After this review and its comments, the report is amended as required, and the review committee then approves the report for release to affected parties. For a single-engine aircraft crash, this reporting process may take 9-12 months to complete, while for a large airliner crash it may take 3-5 years. Since a report is final and conclusive, any evidence, documents and records are destroyed.


Overcontrolling safety, or when safety gets involved, is to fall into the instant gratification trap and conclude with a root cause before the facts are known. Prior to SMS, the safety manager had all the power, and any root cause statement that included the word “safety” was accepted as fact. With the implementation of a safety management system, safety was no longer a matter of verbal statements, but an intelligent system where processes were allowed to mature before changes were made to, or controls placed on, a specific identified item.

A safety management system without statistical process control (SPC) analysis capability is still operating in the pre-SMS era. It is crucial for the validity of an SMS to understand the difference between a process that is in statistical control (stable) and a process that is out of control (unstable). In all processes there are variations. A common cause variation is a variation in the process that is required for the process to function as designed, or to function within the laws of nature or the laws of physics. A special cause variation is a variation introduced to a process that is not required for the process to function as designed. The migratory bird season is a common cause variation: it is required for the process to work, and it causes more bird activity around airports in the spring and fall. A flat tire when driving to work is a special cause variation, since it is not a variation required for the process of traveling to work.

If this month's aviation incidents were higher than last month's, a question to ask is: what happened? This is a common question heard today in many organizations, but many do not know how to answer it. A major barrier to the use of control charts is that SMS enterprises do not understand the information contained in variation. When they understand this information, they will realize that the type of action required to reduce special cause variation is totally different from the type of action required to reduce common cause variation. Control charts also help SMS enterprises to understand why costs decrease as quality improves, and why pointing faults and blame at personnel is totally wrong.


Generally speaking, there are two types of mistakes when looking at data. One mistake is to assume that a data point is due to a special cause when in fact it is due to a common cause, and the second is to assume that a data point is due to a common cause when in fact it is due to a special cause. There are different corrective action plans for a special cause variation and a common cause variation. A special cause variation needs to be removed, while a common cause variation is to be managed within a safety management system.
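As a rough illustration of how a control chart helps avoid both mistakes, the sketch below computes the limits of an individuals (XmR) chart from a baseline of monthly incident counts and then labels a new month as common cause or special cause. The monthly counts are invented for the example, and the names `xmr_limits`, `classify` and `BASELINE` are hypothetical. The constant 2.66 is the standard scaling factor applied to the average moving range on an XmR chart. A real SMS would likely apply additional run rules beyond a single point outside the limits.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Limits:
    center: float
    lower: float
    upper: float


def xmr_limits(counts):
    """Natural process limits for an individuals (XmR) chart."""
    if len(counts) < 2:
        raise ValueError("need at least two baseline points")
    center = mean(counts)
    # Moving range: absolute difference between consecutive months.
    moving_ranges = [abs(b - a) for a, b in zip(counts, counts[1:])]
    spread = 2.66 * mean(moving_ranges)
    # A count cannot be negative, so the lower limit is floored at zero.
    return Limits(center, max(0.0, center - spread), center + spread)


def classify(count, limits):
    """Label one new data point against the baseline limits."""
    if count < limits.lower or count > limits.upper:
        return "special cause"
    return "common cause"


# Invented baseline: reported incidents per month over one year.
BASELINE = [4, 6, 5, 3, 7, 5, 4, 6, 5, 4, 6, 5]

if __name__ == "__main__":
    limits = xmr_limits(BASELINE)
    print(f"center={limits.center:.2f} "
          f"lower={limits.lower:.2f} upper={limits.upper:.2f}")
    for month_count in (7, 12):
        print(month_count, "->", classify(month_count, limits))
```

With this baseline, a month with 7 incidents is higher than the month before it but still falls inside the limits, so asking "what happened?" would be treating common cause variation as a special cause. A month with 12 incidents falls outside the limits and is the kind of signal that warrants a search for a special cause.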


When Safety Gets Involved is when safety makes corrections to, or eliminates a variation in, a stable process, or when safety makes overcontrolling the only acceptable procedure. Simplified, when overcontrolling, or a desire for instant gratification in safety, is happening, the next control point has moved, and it will continue to move farther and farther away from the issue until there is a total and unexpected failure. It is crucial for the success of an SMS to know which battles to fight, and determining a root cause for a common cause variation is not one of them.
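The effect of overcontrolling a stable process can be sketched with a small simulation in the style of Deming's funnel experiment. This is a textbook illustration brought in for this purpose, not something taken from an actual SMS. The function name `simulate` and the rule names are hypothetical, and the process is assumed to have a target of zero with purely common cause (random) variation.

```python
import random
from statistics import pvariance


def simulate(rule, steps=5000, sigma=1.0, seed=2022):
    """Run a stable process under one adjustment rule.

    hands_off  : never adjust the aim.
    compensate : after every result, move the aim by the opposite
                 of that result's deviation from the target (0).
    chase      : after every result, set the aim to where the last
                 result landed.
    """
    rng = random.Random(seed)
    aim = 0.0
    results = []
    for _ in range(steps):
        # Each result is the current aim plus common cause noise.
        result = aim + rng.gauss(0.0, sigma)
        results.append(result)
        if rule == "compensate":
            aim -= result
        elif rule == "chase":
            aim = result
        elif rule != "hands_off":
            raise ValueError(f"unknown rule: {rule}")
    return results


if __name__ == "__main__":
    for rule in ("hands_off", "compensate", "chase"):
        spread = pvariance(simulate(rule))
        print(f"{rule:<10} variance of results: {spread:.2f}")
```

In this sketch, reacting to every data point increases the spread of the results compared with leaving the stable process alone, and chasing the last result lets the process wander away from the target, which is the moving control point described above.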





Line-Item Audits

By OffRoadPilots

Airports and airlines are required to conduct a triennial audit of the entire quality assurance program...