Monday, September 20, 2021

SMS Authority


By Catalina9

A Safety Management System plays a role in the organizational charts of both airports and airlines, but since it does not override any other regulatory requirement, the SMS is an administrative tool rather than a safety improvement tool. Since the SMS is a businesslike approach to safety, poor decisions are allowed, and losses are acceptable. The regulator must verify that airports and airlines comply with all regulations and not just the SMS regulations. This could create conflicts between the SMS regulations and operational regulations.

It is a lonely road for an AE to find hidden SMS facts.

The two avenues of a Safety Management System are the regulatory and operations avenues. The regulatory avenue includes oversight, policies, systems, research, development, design, compliance, project solutions, leadership motivation, quality control, audits, and quality assurance. The operations avenue of the SMS comprises processes, procedures, implementation and maintenance, training, data collection, analyses, review, and communication. Oversight is by the Accountable Executive (AE) and operations is by the SMS Manager.

The two regulatory requirements to act as the AE are control of the financial resources and control of the human resources that are necessary for operations. These requirements are different from the roles and responsibilities of an AE, since they only provide the authority to act as Accountable Executive. The roles and responsibilities are defined in the regulations as being accountable on behalf of an airport authority, a mayor, a city council, a corporation, a business, or a person for meeting the requirements of the regulations. Depending on the size and complexity of an airport or airline, an Accountable Executive is responsible for between 250 and 500 regulations. This responsibility is much greater than the asserted responsibility over financial and human resources.

 

The roles and responsibilities of an SMS Manager are operational in nature. Under the regulations, the SMS Manager is responsible for implementing a reporting system to ensure the timely collection of information related to hazards, incidents and accidents that may adversely affect safety. Timely collection may be different today than yesterday and may look very different tomorrow. When SMS was first invented, timely delivery was by fax. If someone sends a fax today, their report might not arrive on the SMS Manager’s desk.

 

Another responsibility is to identify hazards and carry out risk analyses of the hazards. This responsibility is so huge that it is almost impossible to comprehend. Identifying hazards is not defined in the regulations as a matter of opinion, but as the identification of factual hazards. A hazard identified one day is still a hazard the next day. When hazards are identified, an SMS Manager has a responsibility to investigate, analyze and identify the root cause of all hazards, incidents and accidents identified. The regulatory requirement is not to identify the root cause of selected hazards, but to identify the root cause of all hazards.

 

An effective SMS needs a safety data system to be implemented by electronic or other means. This is another responsibility of the SMS Manager. When this requirement was first implemented a paper-format safety data system was acceptable, but as the SMS evolved it became unmanageable as a paper-format system and electronic databases were used. Over time this system also became obsolete, since electronic spreadsheets could be manipulated or corrupted by adding or removing data. There are several cloud-based SMS services available, and the comprehensive task is to select one that does not demand control over your Safety Management System. There are only a handful of cloud-based data collection tools that let you maintain full control over your own SMS.

 

This leads us to the next responsibility, which is that the SMS Manager implements a safety data system to monitor and analyze trends. Monitoring is to maintain regular surveillance over events, and to do this at uniform intervals. Monitoring events alone does very little to improve safety. After data is collected it is turned into information to be absorbed by one, or all, of the five senses. When absorbed, information turns into knowledge, which is used to analyze for trends. When trends are known, the SMS Manager has a tool to comprehend interconnected links. The same tool is also available to the SMS Manager to monitor and evaluate the results of corrective actions implemented from the analysis.


Concerns of the aviation industry may vary with experience.

The most comprehensive responsibility that an SMS Manager has is to monitor the concerns of the civil aviation industry in respect of safety and their perceived effect on an airport or airline. There are several tasks attached to this regulatory requirement. The first task is to decide what to monitor, the second is to decide when to monitor, and the third is to decide where to monitor, e.g. locally or globally. The next tasks are to define in detail why to monitor, beyond the regulatory requirement, and who should monitor. Monitoring might not be done by the SMS Manager, but could be assigned to dispatch, flight following or an airside maintainer. The final task is to decide how to monitor the aviation industry.


Another responsibility of an SMS Manager is to determine the adequacy of the training required for the SMS Manager and for personnel assigned duties under the safety management system. A person with any responsibility for an aircraft operating airside at an airport, or a person with airside responsibilities, is assigned duties under the SMS. This includes both the Accountable Executive and the SMS Manager, in addition to other workers with roles and responsibilities for the safe operation of an aircraft or airport.

Despite all these SMS responsibilities, airline operational regulations or airside regulations will overrule SMS proactive actions. A requirement at an airport is to maintain obstacle free zones for approach surfaces and transitional surfaces. When an SMS identified that tall trees or construction cranes were almost penetrating these surfaces and should be removed as a precautionary action, the overall decisions in the past were that since these obstructions legally conform, they need not be removed or restricted. The same scenario could be applied to a damaged, but legally conforming engine, a stress-damaged wing that is legally conforming, or the tailstrike damage in the Air China 601 accident. Whether SMS is given its intended regulatory powers by an airline or airport will be documented in how the recovery in aviation after a pandemic assigns accountability to legally conforming concerns. Both pilots and maintenance crew are experiencing the old effect of being “bushed”. As an old bush-pilot, I've seen people get "bushed" living in the middle of nowhere for months and they would do unthinkable things. What the global aviation industry must comprehend is that pilots and mechanics are being “bushed” by quarantine and other enforced pandemic demands. Since the regulations are not broad enough to include, or cover, this aspect of aviation safety, it becomes the responsibility of the airlines and airports to ensure that SMS is allowed to function as intended and capture every “almost” in aviation.

 

Catalina9


Monday, September 6, 2021

Exposure


By Catalina9

Exposure in the Safety Management System is an integrated part of a risk assessment and risk analysis. A risk assessment involves several steps and forms the backbone of an overall risk oversight plan. Included in a risk assessment are one or several risk analyses to determine the defining characteristics of each hazard and to assign risk level scores based on the analysis. Key components of a risk analysis are likelihood, severity and exposure. Likelihood is a definition of times between intervals of an active hazard, severity is a defined outcome of the occurrence, and exposure is the variable, defined as common cause variation or special cause variation and assigned a function, or weight score, between 0 and 1. If the exposure is zero, the hazard does not exist or has been eliminated. When the exposure is one, the impact of a hazard is inevitable.
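
As a minimal sketch of how an exposure weight could enter a risk level score, the example below multiplies likelihood and severity by the exposure weight. The multiplication, the 1 to 5 scales, and the names `HazardAnalysis` and `risk_score` are assumptions made for illustration and are not taken from any regulation or from a specific SMS tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HazardAnalysis:
    """One risk analysis of a single hazard (hypothetical structure)."""
    name: str
    likelihood: int   # assumed scale: 1 (rare) to 5 (frequent)
    severity: int     # assumed scale: 1 (minor) to 5 (catastrophic)
    exposure: float   # weight between 0 (hazard eliminated) and 1 (impact inevitable)

    def risk_score(self) -> float:
        # Reject weights outside the 0 to 1 range described in the text.
        if not 0.0 <= self.exposure <= 1.0:
            raise ValueError("exposure must be between 0 and 1")
        # Assumed scoring rule: exposure scales the likelihood x severity product.
        return self.likelihood * self.severity * self.exposure


if __name__ == "__main__":
    on_ground = HazardAnalysis("inflight icing, aircraft parked", 4, 5, 0.0)
    in_cloud = HazardAnalysis("inflight icing, in cloud above freezing level", 4, 5, 1.0)
    print(on_ground.risk_score(), in_cloud.risk_score())
```

With this rule an exposure of zero removes the hazard from the score entirely, and an exposure of one leaves the full likelihood and severity product in place.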

When common cause variations are treated as special cause variations, the risk analysis has taken the wrong turn at the fork in the road. Common cause variations are integrated in a process, they are necessary for the process and the process would fail if one or more common cause variations were eliminated. An example of common cause variation is ice in clouds and thunderstorms. For ice to form on an aircraft in flight, the air must be cold and contain moisture. Icing conditions frequently occur when moist air is forced upward. As the air rises, it expands and cools. If the air cools to the saturation point, where the temperature equals the dew point, the moisture will condense into clouds or precipitation. For ice to form there must be clouds or precipitation, and icing can be most intense near the cloud tops, where the amount of liquid water is often greatest. This part of the cloud has the greatest amount of lifting, cooling, and condensation. Encountering inflight icing in clouds is therefore a common cause variation, while not encountering inflight icing in clouds is a special cause variation.

When applying exposure in a risk analysis the task is to analyse in 3D, measured in time (hours-minutes-seconds), space (geographical location) and compass (direction). A 3D analysis is to analyse a moving object within the tube itself, rather than from behind, below, above, beside or in front of a moving object. A 3D analysis is the expected view as observed by the pilot at a specific moment in time, location, and direction.

Exposure paints a picture of the past to plan for the future.
The first step when analyzing exposure is to determine if the variation is a common cause variation or a special cause variation. When traveling to or from work, people conduct a mental exposure analysis by leaving at a certain time to avoid the heaviest traffic. In aviation, common cause variation analyses are also conducted for arrivals at major airports or during special events. Comprehension of systems is therefore vital to correctly identify the true variation and develop the proper corrective action plan. If encountering inflight icing in clouds was assigned as a special cause variation with a root cause analysis, the analysis would be derailed from the beginning. The analysis could easily take a turn to explain that icing in clouds was not in the forecast. While this might be true, it does not make it a special cause variation, since icing in clouds is to be expected anytime an aircraft is flying above the freezing level.

A freezing level lower than forecast is still a common cause variation, since this is what freezing levels do every day. The forecast freezing level is nothing but a risk-assessed model of what altitude the level might be at in the future. Pilots and dispatchers often blindly accept icing and freezing level computer models, and a deviation from them could then be mistaken for a special cause variation. The level of exposure changes with the time, location, and direction of an aircraft. An aircraft on the ground has a zero exposure level to inflight icing. The exposure level begins when the aircraft reaches rotation speed. Inflight icing could be from ice accumulated on the ground, and the exposure level for inflight icing is then 1, or a 100% likelihood, or probability, that the ice will affect aircraft performance.

One reason for ground de-icing and anti-icing is to reduce the exposure level of inflight icing to an acceptable level, defined as holdover time. Research on anti-ice fluids has determined that the fluid remains effective for a short period of time, and when an aircraft is airborne before that time expires, the exposure probability, or likelihood, of inflight icing is inconceivable, or times between intervals are imaginary, theoretical, virtual, or fictional.

After it has been determined that a variation is a common cause, the next step is to analyse how the hazard could be exposed, or how the hazard could affect operations. If the hazard is icing in cloud, the analysis shows a likelihood of 1 that flight into known icing will expose the aircraft to inflight icing. The analysis is both a part of the pre-flight planning and inflight operational observations. The severity of icing is determined by several conditions, but for the purpose of icing when entering clouds at a flight level above freezing level, the likelihood of exposure is methodical, planned and dependable, without defining the operational system or processes involved. When analysing flight crew, aircraft and expected level of icing severity, available operational systems play a role. Encountering icing may vary from a level of informational, which is a severity level that is not compatible with another fact or claim of the hazard, to catastrophic which is a severity level where functions, movements, or operations cease to exist, or it could be any level between these two extreme severity levels. Exposure level in SMS is a pre-flight, or pre-task operational tool with actions defined in applicable safety cases.


Special cause variation Beatty NV 1981-03-18
The third step, an analysis of the level of exposure to a special cause variation, is a totally different approach, since a special cause variation is unexpected. It is an abnormal condition and a variation that is irrelevant for the process to function as expected. A special cause variation could be a malfunctioning ITT, or Inter Turbine Temperature, during takeoff. Special cause variations are excluded from pre-flight planning since they are items covered by other levels of protection. Using the ITT example, aircraft engines are regularly inspected and found acceptable, or are removed from the aircraft if unacceptable. When the pilot takes off, the engine is expected to perform as it should without malfunctioning. However, a principle in aviation is to expect the best but to be prepared for the worst. Preparing for the worst at every takeoff is not exposure to an engine failure or other system failures but is a part of an ongoing recurrent training program. Below is an example of how a malfunctioning ITT is identified in a control chart, and when this is identified a root cause analysis must be performed. Normally the ITT is running 680°, but one day it was 681°.


This variation did not trigger an incident or ITT exceedance, but it is a variation that is not common within the system itself and must be investigated with a root cause analysis. Exposure levels trigger two actions: The first action is to prepare for common cause variations and the second action is to conduct a root cause analysis of a special cause variation.
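
The control chart idea in the ITT example can be sketched in a few lines. This is an illustration only: the three-sigma limits, the function names and the baseline readings are assumptions, and real engine trend monitoring would follow the manufacturer's program rather than this rule.

```python
from statistics import mean, pstdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, centre, upper) limits from baseline readings."""
    centre = mean(baseline)
    spread = pstdev(baseline)  # population standard deviation of the baseline
    return centre - sigmas * spread, centre, centre + sigmas * spread


def special_cause_points(baseline, readings, sigmas=3.0):
    """Return (index, value) pairs for readings outside the control limits."""
    lower, _, upper = control_limits(baseline, sigmas)
    return [(i, r) for i, r in enumerate(readings) if r < lower or r > upper]


if __name__ == "__main__":
    # Hypothetical baseline: the ITT has been a steady 680 on every takeoff.
    steady_baseline = [680] * 20
    todays_readings = [680, 680, 681, 680]
    # With no variation in the baseline, the single 681 reading falls outside the limits.
    print(special_cause_points(steady_baseline, todays_readings))
```

Whether 681° counts as a special cause depends entirely on the baseline. If past readings had routinely ranged from 679° to 681°, the same reading would sit inside the limits and be treated as common cause variation.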

Catalina9




Sunday, August 22, 2021

Scale Down for Compliance


By Catalina9


An airport operator has several responsibilities when it comes to the activation of an airport emergency plan, activities during the emergency and post emergency activities. Airport Emergency Plan compliance is a comprehensive task which at first glance seems impossible to comprehend and achieve.

Airport emergency planning is the process of preparing an airport to cope with an emergency occurring at the airport or in its vicinity. The object of the airport emergency planning is to minimize the effects of an emergency, particularly in respect of saving lives and maintaining aircraft operations. The airport emergency plan sets forth the procedures for coordinating the response of different airport agencies and other community agencies in the surrounding community that could be of assistance in responding to the emergency. The basic needs and concepts of emergency planning and exercises are command, communicate and coordinate.

An airport operator has a responsibility to identify organizations at the airport and community organizations that are capable of aiding during an emergency at the airport or in its vicinity. Telephone numbers and other contact information for each organization are listed in the airport emergency plan and the type of assistance each organization can provide is also listed.

An airport operator has a responsibility to identify any other resources available at the airport and in the surrounding communities for use during an emergency, or in recovery operations and provide their telephone numbers and other contact information.

An airport operator has a responsibility to describe lines of authority for each emergency and the relationships between the organizations and how interactions between these organizations are coordinated, and coordination within each of these organizations.

An airport operator has a responsibility to identify supervisors and describe the responsibilities for each emergency.

An airport operator has a responsibility to specify the positions occupied by airport personnel who will respond to an emergency and describe their specific emergency response duties.

An airport operator has a responsibility to identify the on-scene controller and describe the person’s emergency response duties.

An airport operator has a responsibility to provide authorization for a person to act as an on-scene controller or a supervisor if they are not airport personnel.

An airport operator has a responsibility to set out the criteria to be used for positioning the on-scene controller within visual range of an emergency scene.

An airport operator has a responsibility to set out the measures to be taken to make the on-scene controller easily identifiable at all times by all persons responding to an emergency.

An airport operator has a responsibility to describe the procedure for transferring control to the on-scene controller if initial on-scene control was assumed by a person from a responding organization, e.g. fire, ambulance or police.

An airport operator has a responsibility to describe any training and qualifications required for the on-scene controller and other airport personnel identified in the emergency plan.

An airport operator has a responsibility to describe the method for recording any training provided to the on-scene controller and airport personnel.

An airport operator has a responsibility to describe the communication procedures and specify the radio frequencies to be used to link the airport operator with the on-scene controller, and to link the airport operator with the providers of ground traffic control services and air traffic control services.

An airport operator has a responsibility to describe the communication procedures allowing the on-scene controller to communicate with the organizations identified in the emergency plan.

An airport operator has a responsibility to identify the alerting procedures that activate the emergency plan, establish the necessary level of response, allow immediate communication with the organizations identified in the emergency plan in accordance with the required level of response, confirm the dispatch of each responding organization, establish the use of standard terminology in communications, and establish the use of the appropriate radio frequencies as set out in the emergency plan.

An airport operator has a responsibility to specify the airport communication equipment testing procedures, a schedule for the testing, and the method of keeping records of the tests.

An airport operator has a responsibility to specify the location of the emergency coordination center used to provide support to the on-scene controller when ARFF is on the field.

An airport operator has a responsibility to describe the measures for dealing with adverse climatic conditions and darkness for each potential emergency.

An airport operator has a responsibility to describe the procedures to assist persons who have been evacuated if their safety is threatened or airside operations are affected.

An airport operator has a responsibility to describe the procedures respecting the review and confirmation of emergency status reports, coordination with the coroner and the investigator designated by the Transportation Safety Board of Canada regarding the accident site conditions, disabled aircraft removal, airside inspection results, accident or incident site conditions, and air traffic services and NOTAM coordination to permit the return of the airport to operational status after an emergency situation.

An airport operator has a responsibility to describe the procedures for controlling vehicular flow during an emergency to ensure the safety of vehicles, aircraft and persons.

An airport operator has a responsibility to specify the procedures for issuing a NOTAM in the event of an emergency affecting the critical category for fire fighting if ARFF are available on the field, or changes or restrictions in facilities or services at the airport during and after an emergency.

An airport operator has a responsibility to describe the procedures for preserving evidence as it relates to aircraft or aircraft part removal, and the site of the accident or incident in accordance with the Canadian Transportation Accident Investigation and Safety Board Act.

An airport operator has a responsibility to describe the procedures to be followed, after any exercise or activation of the plan, for a post-emergency debriefing session with all participating organizations, the recording of the minutes of the debriefing session, an evaluation of the effectiveness of the emergency plan to identify deficiencies, changes, if any, to be made in the emergency plan, and partial testing subsequent to the modification of an airport emergency plan.

An airport operator has a responsibility to describe the process for an annual review and update of the emergency plan, describe the administrative procedure for the distribution of copies of an updated version of the emergency plan to the airport personnel who require them and to the community organizations identified in the plan, and describe the procedures to assist in locating an aircraft when the airport receives notification that an ELT has been activated.

An airport operator includes in the airport emergency plan a copy of signed agreements between the airport operator and community organizations that provide emergency response services to the airport and an airport grid map.

A Safety Management System (SMS) is a process oversight system of all areas of airport operations. The challenge with an Airport Emergency Plan (AEP) is not the number of required responsibilities, or the conglomerate of interactions, but that the AEP must be scaled down to the size and complexity of the airport. Unless the AEP is scaled, the airport operator is in non-compliance with a regulatory requirement that a safety management system is adapted to the size, nature and complexity of the operations, activities, hazards and risks associated with the operations. The key to success is to scale down to a common denominator with combined tasks.

Catalina9


Sunday, August 8, 2021

Your Safety Data System


By Catalina9

The regulations require that an airport or airline operator implement a safety data system, by either electronic or other means, to monitor and analyze trends in hazards, incidents and accidents. Regulations are scalable, and a paper format is included as other means to monitor and analyze trends. At some smaller airports, with only one or two persons managing and maintaining the airport, the paper format may work for that size and complexity. For airports with three or more workers, or for larger airports and airlines, it becomes a humongous and labor-intensive task to maintain regulatory compliance by monitoring and analyzing trends using paper documents.

Unless there is tangible action the SMS is only empty words

The Safety Management System (SMS) is more than data point entries and designing graphs. SMS needs to be built up by a safety data system with tangible actions and results. A safety data system must be autonomous, preserve its integrity, and be flexible and scalable to size and complexity, or tailored to operational needs. In an autonomous safety data system there is task completion, performance reliability and performance analytics. Performance analytics is the engine, or system, that uncovers insights and reveals hidden value to define new, targeted learning interventions. The result is learning spend that helps aviation safety achieve key objectives.

A requirement for a safety data system is that it acts as an inhibitor against corruption, subjectivity or bias. Corruption is when a system is altered, intentionally or unintentionally, causing a different outcome. Subjectivity is when someone has a personal interest, or an agenda to manipulate the outcome of data collected, or of facts discovered. Bias is prejudice of outcome based on an assumption or an opinion about someone, or something, simply based on past history. The difference between corruption, subjectivity and bias is that corruption could be an error, mistake or intentional action, subjectivity is personal to the outcome where facts are ignored, and bias is a decision made prior to an investigation or fact-finding mission. A safety data system must prevent these opportunities from occurring within its system.

A paper format safety data system is inherently corrupted by self-degradation over time. A document may be legible one year but totally unreadable the next year. Paper documents can also be altered or lost. In an operation with two workers only, such as the Accountable Executive and Airport Manager/SMS Manager, a paper format may work since any changes are traced to one or the other. If there are three workers, a conflict of interest may arise. Paper documents are an available option under the regulation to accommodate the simplest common denominator, which is one aircraft and one person, or one airport and one person.
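
One way an electronic safety data system can make alteration detectable is to chain each report to the previous one with a hash. The sketch below is an assumption about how such a safeguard could look, not a description of any particular product; the `ReportLog` class and the report fields are hypothetical.

```python
import hashlib
import json


def _digest(prev_hash, report):
    """Hash a report together with the hash of the entry before it."""
    payload = json.dumps(report, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


class ReportLog:
    """Append-only report log in which each entry depends on all earlier entries."""

    GENESIS = "0" * 64  # starting value used before the first report

    def __init__(self):
        self._entries = []  # list of (report, hash) pairs

    def append(self, report):
        prev = self._entries[-1][1] if self._entries else self.GENESIS
        entry_hash = _digest(prev, report)
        self._entries.append((dict(report), entry_hash))
        return entry_hash

    def verify(self):
        """Return True only if no stored report has been altered or removed mid-chain."""
        prev = self.GENESIS
        for report, entry_hash in self._entries:
            if _digest(prev, report) != entry_hash:
                return False
            prev = entry_hash
        return True


if __name__ == "__main__":
    log = ReportLog()
    log.append({"id": 1, "text": "FOD found on taxiway"})
    log.append({"id": 2, "text": "Runway light unserviceable"})
    print(log.verify())
```

A chain like this detects a changed or removed earlier entry, but it does not by itself detect removal of the most recent entry, so the latest hash would also need to be stored somewhere the operator cannot edit.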

Electronic spreadsheets are an option often used by airline and airport operators as their safety data system. Just as a paper system, an electronic spreadsheet system may also be corrupted, subjective or biased to the facts. A safety data system that is not corrupted, subjective or biased starts with the SMS safety policy. An effective Safety Policy is a tool to manage corruption, subjectivity or bias and must be tailored to the organization so all personnel can recognize accountability, accept accountability for the policy and take ownership of it. Without ownership of the Safety Policy, the policy is an ineffective tool and in itself a hazard to safety.

When there is no accountability to the Safety Policy it becomes more important to adhere to the text in the policy rather than the intent of safety in operations. When the text itself is paramount in the decision-making process, a grammatical error has in the past become the determining factor for a regulatory SMS finding. When selecting a safety data system, the two most important functions to consider are the probability of file deletion or alteration, and the simplicity of reporting. In its simplest form an SMS report should accept a submission with one or two pictures only. A system where files can be deleted by an operator does not preserve the integrity of the system. Files must remain in the safety data system for as long as they are applicable to operations, or personnel, at which time they may be archived, but still available for retrieval. An example would be the Canadian CADORS files for an airport. The airport may analyze CADORS for the past five years, but after 5 years and 1 month, the oldest month may be archived.
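
The five-year CADORS example can be sketched as a rule that moves reports to an archive without ever deleting them. The 60-month window follows the example above; the month-level comparison, the function names and the report layout are assumptions made for illustration.

```python
from datetime import date


def months_between(earlier, later):
    """Whole calendar months from the month of `earlier` to the month of `later`."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def partition_reports(reports, today, active_months=60):
    """Split reports into (active, archived); nothing is ever dropped."""
    active, archived = [], []
    for report in reports:
        age = months_between(report["date"], today)
        # Reports from more than `active_months` calendar months ago go to the archive.
        (archived if age > active_months else active).append(report)
    return active, archived


if __name__ == "__main__":
    reports = [
        {"id": "A", "date": date(2016, 8, 15)},   # 61 months before September 2021
        {"id": "B", "date": date(2016, 9, 25)},   # 60 months before September 2021
        {"id": "C", "date": date(2021, 1, 1)},
    ]
    active, archived = partition_reports(reports, today=date(2021, 9, 20))
    print([r["id"] for r in active], [r["id"] for r in archived])
```

Archived reports stay in the returned list and remain available for retrieval, which is the property the paragraph above asks of a safety data system.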

If advertising does not work, social media does not affect safety in aviation


CADORS are as much a part of the data collection system as any other report. If CADORS are excluded from the hazard register, an airline or airport operator is operating with a corrupt safety data system and a skewed analysis. Public complaints are also a part of the hazard register, since public opinion affects how the regulator views an airline or airport. As an example, it was not long ago that the regulator revoked a certificate with unsubstantiated findings, or findings added after the inspection, due to public opinion of the operator. Another example is how public opinion biased a CADORS against a smaller operator and gave an excuse for the airline’s on-time departure record. The excuse for why the airliner entered the runway for backtracking when the smaller aircraft was on base leg was that they needed a VFR departure since the IFR clearance was going to take a while. When the smaller aircraft turned onto final, the taxiing aircraft was ¾ of the way down the runway for takeoff and declared that they had vacated the active runway when they were parked in the turnaround bay. An unbiased CADORS would have stated the facts, which were that a small aircraft had to make an avoidance maneuver due to an airliner backtracking on an active runway. There are several examples of how public opinion or social media affects the CADORS. For an airport or airline to preserve its integrity and fight for its regulatory conformance, CADORS must be investigated and filed in the hazard register. Another short CADORS example shows how social media or public opinion makes it into the CADORS [redacted]: An aircraft flew directly overhead an airport [small private airport] northbound, at an altitude of approximately 1000 feet above ground level and a rate of 151 knots ground speed without making any radio calls. Video evidence is available.

Your Safety Data System must be integrated as a winning combination of your quality control and quality assurance system for incremental safety improvements. SiteDocs is a winning safety data collection tool. Data collected must be preserved and include reports that are both favorable, and unfavorable for your operations. A Safety Data System is more than just collecting and filing reports, it is a tool for the Accountable Executive to learn and comprehend safety in operations and review how lessons learned are derived from the Safety Policy. 

 

Catalina9

Monday, July 26, 2021

$ Money Talks $


By Catalina9

One could define risk management as the identification, analysis and elimination of those hazards, as well as the residual risks, that threaten the viability of an enterprise. The discussion of whether it is possible or practical to eliminate hazards is ongoing, with opposing views. Airports and airlines accept the inherent risks in aviation every time there is a movement on the field or in aeronavigation. On the other hand, both regulators and professional auditors expect from corrective action plans that an operator makes changes to ensure that an occurrence will never happen again. While it is unreasonable to expect the complete elimination of risk in aviation, it is also unreasonable to expect that all risks are acceptable. It is a fine line to balance between what risks to eliminate, and what risks to accept. Risk acceptance, or elimination, is a 3D identification process measured in time (speed), space (location), and compass (direction). When 3D thinking is introduced, a future scenario, or the exposure level, can be designed. Risk mitigation then becomes an exposure level mitigation and not the mitigation of the hazard itself. This does not imply that the future can be predicted, but it implies that data, information, knowledge, and comprehension are vital steps to predict hazards that affect operational processes. Exposure level mitigation is currently a major part of risk mitigation, e.g., airside markings, markers, signs or lighting, or aeronavigation flow into congested airspace and for gate assignments.

Risks in aviation are the common cause variations, which are variations within a process and are required to be part of the process for it to function as intended. An example of a common cause variation is runway friction. Without runway friction, landings and takeoffs would not be possible. For an air operator, runway friction becomes a special cause variation with rain, snow or slush. Special cause variations are mitigated to an acceptable exposure level. The difference between a risk and a hazard is that a hazard is one item and the effect it has on safety, while a risk is a conglomerate of hazard probabilities in a 3D scenario with a combined effect on safety.

Let’s take a moment and analyze the probability of a midair disaster involving two aircraft departing 350 NM apart and travelling to two different destinations in non-congested airspace. If a risk assessment of a midair collision were done prior to departure, the assumption is that both assessments would accept the risk and define it as green. In this first risk assessment the planned departure time and destination of the other aircraft were unknown. An inherent risk in aviation, or common cause variation, is that the 3D positions of other aircraft flying in accordance with the visual flight rules (VFR) are unknown. In an instrument flight rules (IFR) environment, the positions of other aircraft, or their estimated 3D positions, are known and mitigated, so the exposure level is mitigated to an acceptable level. In a VFR operational environment, the exposure level is unknown until communication between pilots or visual contact has been established.


Safety in aviation is the strategic game of moving hazards.
Two aircraft may be on a collision course without knowing of each other. Depending on aircraft design, an approaching aircraft may be in a blind spot for several minutes, as it was for flight 498. An exposure level may last for several minutes, or only for a split second. When the 3D location is unknown, the exposure level is unknown, even if two aircraft are on a certain collision course. In 2012 two aircraft departed 350 NM apart for different destinations and collided midair. A 3D location could have been calculated if their altitude, track and groundspeed were known. However, flying VFR and relying on visual or audio cues is an inherent risk, or a common cause variation in aviation. A common cause variation transforms into a special cause variation when one or more of the other systems malfunction. The investigating authority identified a weakness of the see-and-avoid system for VFR flights. A secondary system malfunction may have been in position reporting, such as reporting when departing an altitude or communicating an intended VFR approach procedure.
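As a rough illustration of how a 3D location could be calculated from altitude, track and groundspeed, the sketch below estimates the closest point of approach for two aircraft. It assumes straight, level, constant-speed flight on a flat local grid measured in nautical miles; the function names and the sample numbers are hypothetical and are not taken from any accident report.

```python
import math


def velocity_kt(track_deg, groundspeed_kt):
    """East and north velocity components (knots) from a true track and groundspeed."""
    rad = math.radians(track_deg)
    return groundspeed_kt * math.sin(rad), groundspeed_kt * math.cos(rad)


def closest_approach(pos_a_nm, track_a, gs_a, alt_a_ft,
                     pos_b_nm, track_b, gs_b, alt_b_ft):
    """Return (hours to closest approach, horizontal miss distance in NM,
    vertical separation in ft), assuming both aircraft hold track, speed and altitude."""
    vax, vay = velocity_kt(track_a, gs_a)
    vbx, vby = velocity_kt(track_b, gs_b)
    # Position and velocity of aircraft B relative to aircraft A.
    rx, ry = pos_b_nm[0] - pos_a_nm[0], pos_b_nm[1] - pos_a_nm[1]
    vx, vy = vbx - vax, vby - vay
    speed_sq = vx * vx + vy * vy
    # Time that minimises the separation; never earlier than "now".
    t = 0.0 if speed_sq == 0 else max(0.0, -(rx * vx + ry * vy) / speed_sq)
    miss = math.hypot(rx + vx * t, ry + vy * t)
    return t, miss, abs(alt_a_ft - alt_b_ft)


# Hypothetical case: two aircraft 350 NM apart, flying toward each other at the same altitude.
t, miss, vertical = closest_approach((0.0, 0.0), 90, 120, 3500,
                                     (350.0, 0.0), 270, 130, 3500)
print(f"closest approach in {t:.2f} h, miss {miss:.1f} NM, vertical {vertical} ft")
```

In a VFR environment none of these inputs are shared between the two cockpits, which is why the exposure level stays unknown until communication or visual contact is established.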

The safety cycle in aviation is safety, operations, and accounting. When student pilots take off for their first solo flight, their primary concern is safety and that their first landing will be a safe landing. What their general flying skills are, or what the cost of the airplane is, becomes secondary to safety. When safety is achieved and the student pilot is proficient in landing, the focus moves to cross-country skills and flights beyond sight of the airport. More time accumulated equals more money spent. Eventually, money becomes the governing factor of flying.

Safety is Project Solutions Leadership Motivation

The principle, or cycle, of safety, operations and accounting is one that airlines and airports go through at regular intervals. When an airline first starts up, including new low-cost carriers, its primary concern is safety. Without safety processes in place, it would not qualify for the operations certificate. When SMS was mandated by regulation, airlines and airports went overboard to ensure safety compliance. As they move forward, customer service is added to safety in operations, but eventually their capacity reaches its limit and cost becomes the determining factor. A regional airline spent more than $750,000.00 within a short time to ensure safety compliance. Eventually the accounting department focuses on cash spent on safety and demands reductions in spending. At first this seems reasonable and acceptable, but over time this drift eliminates critical tasks and moves the operations closer to the fine line between safety and incidents. Several years ago, a regional operator that had not experienced a fatal accident in 35 years had its first fatal accident because it relied on the track record of prior years, which had included safety processes. With a good track record, it made sense to accounting to reduce cash spent on safety investments. Failing to plan is planning to fail.

Safety in aviation is not which accidents or incidents did not occur, but what the cash return on safety investment is. In general terms, return on investment is the additional revenue, or cash, generated. The return on investment in aviation safety is the reduction of cash spent on safety, or negative cash generated. Return on investment of SMS is not the savings from a reduction of accidents or incidents, but the return of cash revenue generated by in-control processes and organizationally based safety investment decisions. A CEO of a company works with cash daily, and a reduction in quantity is less significant than a higher cash value of the organization. For an airline or airport with 500,000 annual movements or cycles, a reduction of annual incidents from 1,500 to 1,200 is less significant to the CEO and the Board than a reduction in cash spending of $1,080,000.00.
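The figures above can be put side by side in a short worked example. This is a sketch only: it assumes that the $1,080,000.00 reduction in cash spending is attributed to the 300 fewer incidents, which the text implies but does not state, and the variable names are illustrative.

```python
# Figures quoted in the paragraph above.
annual_movements = 500_000
incidents_before = 1_500
incidents_after = 1_200
cash_reduction = 1_080_000.00  # dollars


def rate_per_1000(incidents, movements):
    """Incident rate per 1,000 movements."""
    return incidents * 1_000 / movements


incidents_avoided = incidents_before - incidents_after
# Assumption: the whole cash reduction is attributed to the avoided incidents.
cash_per_incident_avoided = cash_reduction / incidents_avoided

print(f"rate before: {rate_per_1000(incidents_before, annual_movements):.1f} per 1,000 movements")
print(f"rate after:  {rate_per_1000(incidents_after, annual_movements):.1f} per 1,000 movements")
print(f"incidents avoided: {incidents_avoided}")
print(f"cash per incident avoided: ${cash_per_incident_avoided:,.2f}")
```

The incident rate moves by less than one incident per 1,000 movements, a change that is easy to overlook, while the same change expressed in cash is a seven-figure number.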

When the reduction of cash spent on incidents has a positive impact on the bottom line, the old-fashioned cycle of safety may be broken, and continuous safety improvements become an available option for the processes. Money talks, and when safety is the profit generator, it makes sense to invest in safety.


Catalina9

Friday, July 9, 2021

Make An Effective Root Cause Analysis


By Catalina9

Within an aviation safety management system, a root cause analysis should be conducted of special cause variations that caused an incident. The two types of variations are common cause variations and special cause variations. A common cause variation exists within the system itself as an inherent risk and is to be mitigated by applying a risk analysis of a probability exposure level upon arrival at a location, direction, or time. Bird migration and seasonal airframe icing are examples of common cause variations. Special cause variations do not exist within the process itself but are interruptions to a process by external forces. Birds or wildlife on the runway, or an icy runway, are special cause variations, since they are beyond airport certification requirements, and the airport operator is expected to maintain a bird- and wildlife-free runway environment and a contamination-free movement area. However, for an airport operator both birds and wildlife and ice contamination are common cause variations to which they should apply an expected exposure level upon arrival of an aircraft.

The two most common root cause analysis processes are the 5-Whys and the Fishbone. The fishbone analysis is a visual analysis, while the 5-Whys is a matrix. The preferred method is defined in the enterprise’s SMS manual. A root cause output, or the corrective actions required, will vary with the type of analysis used and the subjectivity of the person conducting the analysis. The first step in a root cause analysis is to determine if a root cause analysis is required and why it is required. A risk level matrix should identify when a root cause analysis is needed. A root cause analysis should be conducted for special cause variations. However, the risk level of a special cause should be the determining factor for the analysis. For a risk matrix to be both objective and effective, it must define the immediate reaction upon notification, identify when a root cause analysis is needed, and define both the risk levels at which an investigation is required and the acceptable risk level at which an investigation is conducted.
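A risk matrix of this kind can be expressed as a small lookup. The sketch below is one possible layout, not a regulatory or published scheme: the 1 to 5 likelihood and severity scales, the score bands, the color names and the reaction texts are all hypothetical placeholders for what an enterprise would define in its own SMS manual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskResponse:
    level: str
    immediate_reaction: str
    root_cause_analysis: bool
    investigation: bool


# Hypothetical bands: (highest score in band, response). Real bands belong in the SMS manual.
BANDS = [
    (4, RiskResponse("green", "log and monitor", False, False)),
    (12, RiskResponse("yellow", "notify the SMS Manager", True, False)),
    (25, RiskResponse("red", "pause the operation and notify the AE", True, True)),
]


def classify(likelihood: int, severity: int) -> RiskResponse:
    """Map a likelihood and severity rating (each 1..5) to a pre-defined response."""
    if not (1 <= likelihood <= 5 and 1 <= severity <= 5):
        raise ValueError("likelihood and severity must be between 1 and 5")
    score = likelihood * severity
    for upper, response in BANDS:
        if score <= upper:
            return response
    raise AssertionError("score outside all bands")


response = classify(likelihood=3, severity=4)
print(response.level, "|", response.immediate_reaction,
      "| root cause analysis:", response.root_cause_analysis)
```

Because the reaction, the need for a root cause analysis and the need for an investigation are all read from the same table, the decision no longer depends on who happens to receive the report.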

When conducting a root cause analysis there are four factors to be considered. The first is human factors, the second is supervision factors, the third is organizational factors and the fourth is environmental factors. Environmental factors are categorized into three sub-factors, which are the climate (comfort), design (workstation) and culture (expectations). Culture differs from organizational factors in that it consists of expectations applied to time, location, or direction. Example: A client expects a task to be completed at a specific time at an expected location, with a direction of movement after the task is completed. Organizational factors are how the organizational policies express commitments to the internal organization in an enterprise, together with the accountable executive’s commitment.


There is only one root cause,
but several options for selection
A principle of the safety management system is continuous, or incremental, safety improvements, and an accurate root cause sets the stage for moving safety forward. The very first step in a root cause analysis is to identify the correct finding. This might be a regulatory non-compliance finding, an internal policy finding, or a process finding. The root cause analysis for a regulatory non-compliance finding is an analysis of how a regulation was missed, or how an enterprise drifted away from the regulatory requirement. An example of regulatory non-compliance is when an enterprise drifts away from making personnel aware of their responsibilities within a safety management system. The root cause is then applied to the accountable executive level, who is responsible for operations or activities authorized under the certificate and accountable for meeting the regulatory requirements. The root cause for an internal policy finding is when the safety policy becomes incidental and reactive to event occurrences, rather than a forward-looking policy, organizational guidance material for operational policies and processes, and a road map with a vision of an end-result. A sign of a safety policy in distress, or a system in distress, is when policy changes are driven by past events, opinions, or social media triggers, rather than future expectations. An internal policy root cause is applied to the management level in an enterprise. The most common root cause analysis is a process finding root cause. This root cause analysis is applied to the operational level. An example could be a runway excursion. With a runway excursion both the airport and airline are required to conduct a root cause analysis of their processes.
The root cause is your compass.

A root cause analysis is to backtrack the process from the point of impact to a point where a different action may have caused a different outcome. A five-column root cause matrix should be applied to the analysis. The justification for a five-column analysis is to populate the root cause matrix with multiple scenario questions rather than one scenario that funnels into a root cause answer. The beauty of a five-column root cause analysis is that answers from any of the columns may be applied to the final root cause, and if it is later determined to be an incorrect root cause, the answers to the new root cause analysis are already populated in the matrix. When the root cause is assigned, it should be stated in one sentence only. It is easy to fall into the trap of assigning the root cause to what was not done. However, since time did not stop and something was done, the root cause must be assigned to what was done prior to the occurrence. An example of an ineffective root cause would be that the pilot did not conduct a weight and balance prior to takeoff. In the old days of flying, the weight and balance of a float plane was to analyze the depth and balance of the floats. Airplanes flew without incidents for years using this method. For several years standard weights were applied to personnel and luggage. Applying the standard weight process is similar to applying the float analysis process. Aircraft flew without incidents for years applying guesstimates of weight rather than actual weight. At the end of the day, the fuel burn became the tool to confirm if correct or incorrect weight was applied. That a weight and balance was not done is not the root cause. The root cause could be one or a combination of human factors, organizational factors, supervision factors or environmental factors. The next step in a root cause analysis is to analyze these factors to assign a weight score to the root cause factor.

A weight score is applied to human factors, organizational factors, supervision factors and environmental factors by asking the 5-W’s + How.  Examples of considerations are shown below.
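One way such a weight score could be tallied is sketched here. The scoring rule is an assumption, not the author's worksheet: each factor earns one point for every 5-W's + How question whose answer points at that factor, and the sample answers are invented for illustration.

```python
FACTORS = ("human", "supervision", "organizational", "environmental")
QUESTIONS = ("what", "when", "where", "why", "who", "how")


def weight_scores(answers):
    """answers maps factor -> {question: True when the answer points at that factor}.
    Returns factor -> number of questions pointing at it (0..6)."""
    return {
        factor: sum(1 for q in QUESTIONS if answers.get(factor, {}).get(q, False))
        for factor in FACTORS
    }


def leading_factors(scores):
    """Factors sharing the highest weight score (more than one if tied)."""
    best = max(scores.values())
    return [factor for factor, score in scores.items() if score == best]


# Hypothetical answers for a single occurrence.
sample = {
    "human": {"what": True, "who": True},
    "supervision": {"when": True, "why": True, "how": True},
    "organizational": {"why": True},
}
scores = weight_scores(sample)
print(scores)
print("leading factor(s):", leading_factors(scores))
```

A tie between factors is reported rather than broken automatically, since the root cause may be a combination of factors.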


When the root cause has been decided, but prior to the implementation phase of the corrective action plan (CAP), link it to the safety policy via objectives and goals by means of a process design flowchart of the expected outcome. This flowchart is your monitoring and follow-up document of the CAP for each step defined in the process.


Catalina9








Monday, June 28, 2021

Illegal Activity, Negligence or Wilful Misconduct


By Catalina9

The Safety Management System (SMS) Safety Policy is the platform on which the SMS is built. The policy is built on an idea, a vision, and expectations of future achievements. A policy is a course or principle of action adopted or proposed by a government, party, business, or individual. An SMS policy follows these same principles and remains in force until the idea, vision, or expectations change. It is crucial to the success and integrity of a Safety Management System that the Safety Policy is designed to serve the major definite purpose of an enterprise. There is one quality which one must possess to win, and that is definiteness of purpose, the knowledge of what one wants, and a burning desire to possess it. A major definite purpose is the core purpose for the existence of an organization, and the hub where goals, objectives and processes are developed and designed. The more you think about the purpose of your SMS Policy, and how to achieve it, the more you begin to attract people, opportunities, ideas, and resources that help you to move more rapidly toward your goal.


Negligence is drift, or abandonment, and invisible in the daily operations.
A requirement of a safety policy is that it includes an anonymous reporting policy, a non-punitive reporting policy and a confidential reporting policy. A purpose of these reporting systems is to preserve the integrity of the Safety Policy. A non-punitive reporting policy is a commitment by the Accountable Executive (AE), to the event itself and to personnel who were involved in an event, that punitive actions are off the table. During the pre-SMS days, punitive actions were applied depending on the severity of the outcome, prior history and expected reoccurrence. Punitive actions were subjective, biased and based on one person’s opinion. A pilot who was involved in an incident expected to be terminated on the spot. There was an expectation by air operators that a commercial pilot should have the knowledge and experience to get the job done. Back then, when the weather was low, the expectation was to go and take a look: if you see the runway, land, but if you don’t, try again. A young pilot did just that, flew an approach to zero visibility, landed and kept the job. Today, in an SMS world, a pre-takeoff hazard report could have been submitted and the flight cancelled. Then there are other examples, such as a pilot who was terminated so that the operator would look good to its clients after an aircraft on fire was recovered with the first officer frozen on the controls. Punitive actions were an integrated part of a system to improve pilot skills and remove the bad apples. In a non-punitive reporting system, the report may go to those who need to know and to those who should know, and it is also disseminated throughout the organization as a summary of events for information purposes.

A confidential reporting system is when the report only goes to the persons who need to know, or to a director level within the organization. At the level of directors, the report may go to others than those who need to know, but the report is still confidential to the director level. The purpose of a confidential reporting system is that the reporting process is within a controlled system and that the contributor has confidence that the report is not shared outside of the director level, or beyond those who need to know. A contributor of a confidential report may allow the report to be shared within the organization, or may also contribute to the SMS with videos and clarification of how an incident happened.

A non-punitive policy is the most often applied policy, since it is applied to every single SMS report received as a commitment to the contributor by the AE. There is an ongoing discussion in the aviation industry about what makes a non-punitive policy effective. One opinion is to view it from a contributor’s point of view, as a job performance assessment in which an incident becomes a learning tool for continuous safety improvements. This is defined in a statement of the conditions under which immunity from disciplinary action will be granted. Another opinion is to view it from an enterprise’s point of view, in which the policy is only applied if an incident does not trigger litigation or legal action against the worker or enterprise. This is defined in a statement that the non-punitive policy is not applied if a worker was involved in illegal activity, negligence or wilful misconduct.

The two regulatory requirements above support the same common goal, which is safety in aviation, but they are opposing views. They are opposing views since one requires definitions of unacceptable behaviors, while the other requires definitions of acceptable behaviors. There is a fine line, and often an invisible line, between accepting an event under the non-punitive policy and excluding it from the policy.

Non-punitive policy is a tool for continuous learning.

The foundation of a Safety Management System is a just-culture. In a just-culture there is trust, learning, accountability and information sharing. A just-culture is where there are justifications for actions or reactions. For an enterprise to apply one or the other definition to its non-punitive policy, a safety case, or change management case, must be conducted with a risk assessment of its justifications for the application of either of these two definitions. Both unacceptable behaviors, when punitive actions are necessary, and acceptable behaviors, when immunity will be granted, must be defined pre-event in the Safety Policy, with detailed definitions and publication of the five W’s and How in the SMS Manual. The five W’s and How define the process of What, When, Where, Why, Who and How for both illegal activity, negligence or wilful misconduct, and for when immunity from disciplinary action will be granted.

There is no expectation that an enterprise retains workers who show behaviors of illegal activity, negligence or wilful misconduct. These behaviors could cause the destruction of a successful business. However, SMS is a job-performance review and not a legal activity review. When an SMS policy states that illegal activity, negligence or wilful misconduct are unacceptable, everything else becomes acceptable. Until the level of these behaviors is reached, the AE makes a commitment to the worker to continue to work. In addition, in an enterprise that allows for any behavior except illegal activity, negligence or wilful misconduct, there is no room for training or continuous safety improvements. On the other hand, in an organization that defines the conditions under which immunity from disciplinary action will be granted, a list of job-performance safety critical areas can be defined and applied. It is crucial for an enterprise to comprehend that even if punitive actions are accepted, there is no regulatory requirement that they must be applied. However, when applied, they must be applied systematically, or evenly, to all workers, including senior management. The very first case of punitive action applied pursuant to the SMS policy sets the bar for all future punitive actions.

When conducting a safety case for which definition to apply to a safety policy, the case must focus on how a policy affects the future of operations and, more importantly, how the policy affects an expanding business. A short-term non-punitive policy applied to a single-pilot, single-engine operator, or a small regional airport, may restrict the operator from expanding into multicrew and multi-engine aircraft, or may restrict an airport from expanding to multiple runways and international traffic. A safety case applies the 5-W’s and How to processes rather than to the issue. As an example, the What question could be asked as: What is illegal activity, negligence or wilful misconduct, or What is the process to establish the baseline for illegal activity, negligence or wilful misconduct. Asking a process question does not eliminate the fact that these behaviors must be clearly defined in the SMS manual.





Both scenarios require comprehensive pre-defined and published definitions. A concept of the SMS is to pre-define and clearly spell out job-performance expectations. When job performance expectations are undefined until they reach the level of illegal activity, negligence or wilful misconduct, the line at which these levels are reached must be clearly defined. Generally speaking, illegal activity is an act committed in violation of law where the consequence of conviction by a court is punishment, especially where the punishment is a serious one such as imprisonment. A definition of negligence is failure to use reasonable care, resulting in damage or injury to another, and a definition of wilful misconduct is any act, omission or failure to act (whether sole, joint or concurrent) by a person that was intended to cause harmful consequences to the safety or property of another person. In addition to general definitions, each sub-definition must be clearly defined. When job performance expectations are defined under which immunity from disciplinary actions will be granted, these expectations must be clearly defined. They are defined as Safety Critical Areas with a subcategory of Safety Critical Functions. A comprehensive list could include more than 500 events to consider.

An airport or airline operator must apply the regulatory requirement applicable to their operations. Within a just-culture, or a non-punitive environment, there must be justification for pre-defined actions or reactions. The four principles within a just-culture are trust, learning, accountability and information sharing. As long as an operator is governed by these principles, they may apply any non-punitive policy tailored to the needs of their operations.

Catalina9



