Friday, January 5, 2018

Frequent Audits or Enhanced Monitoring

The latest news reports for airlines and passenger jet aircraft are that 2017 was the safest year on record for commercial air travel, with zero fatal accidents. When cargo planes and passenger turboprop aircraft are counted, there were fatalities among both passengers and bystanders on the ground. This safety record may be attributed to several factors. Some might say that it is due to the implementation of the Safety Management System (SMS) in aviation, while others might say that it is due to enforcement of regulatory compliance. What to remember is that there are no self-made safety operators. Safety is a conglomerate of independent but cooperative systems, processes, procedures and organizational policies.

A stable system could be effective for years to come.
When an accident happens, it is an over-simplified solution to immediately place blame on the safety management system, in the same manner as it was an over-simplified solution to always blame accidents on pilots. It could be that there was a system failure, and it could be that it was the pilot’s fault. However, without data to justify this root cause, a safety conclusion has been drawn from biased opinions without facts.

There are inherent risks in aviation. These risks are caused by common cause variations and special cause variations. Simplified, common cause variations could be identified as system failures, while special cause variations are variations due to unpredictable events. An effective Safety Management System operates with safety processes at a level above the bar of the minimum acceptable level of safety, and anyone who demands process changes comprehends the Air Operator’s safety processes. Without comprehending the SMS there is a danger of over-controlling stable processes, or tampering with the processes. Over-controlling to change an undesirable result, or to make good SMS processes better, could be due to demands by the regulatory authority, demands by customers, or the operation itself. The result of over-controlling could be a drift toward a higher probability that a special cause variation occurs, and a higher risk level which could cause a non-scheduled event.
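The distinction between the two kinds of variation can be illustrated with a control chart. The sketch below is only an illustration and not a method taken from this post: it assumes a baseline period that is considered stable, uses the common three-sigma convention for control limits, and the monthly counts and the function name `classify_points` are hypothetical.

```python
from statistics import mean, stdev


def classify_points(baseline, new_points, sigma_limit=3.0):
    """Label new observations as common cause or special cause variation.

    The control limits come from a baseline period that is ASSUMED to be
    stable; the three-sigma width is a convention, not a regulatory value.
    """
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    lower = center - sigma_limit * spread
    upper = center + sigma_limit * spread

    labels = []
    for value in new_points:
        if lower <= value <= upper:
            # Inside the limits: variation the stable process normally shows.
            labels.append("common cause")
        else:
            # Outside the limits: a signal worth investigating.
            labels.append("special cause")
    return (lower, upper), labels


# Hypothetical monthly event counts, invented for illustration only.
baseline_counts = [4, 5, 3, 6, 4, 5, 4, 5]
limits, labels = classify_points(baseline_counts, [5, 3, 12])
print(limits, labels)
```

In this sketch, adjusting the process in response to points labeled "common cause" would be the over-controlling, or tampering, described above.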

Customer service and regulatory compliance complement each other and do not contradict safety performance. Regulatory requirements form the structure necessary for safety in operations, while customer service drives the safety processes required for regulatory compliance and continuous improvement of customer satisfaction. Regulations in themselves are not requirements for safety in operations, but requirements for compliance in a static environment. An example is the operator certificate, which is issued (and must be issued) before any flights take place. Operational safety, or continuous customer satisfaction, begins at the time and place an aircraft moves under its own power. The task then becomes to maintain regulatory compliance and, in addition, to have processes in place that produce the best possible safety outcome. The SMS regulatory requirements still apply to static operations, and it becomes the accountability of operators to apply SMS in their systems, policies, processes, procedures and expectations, and to monitor how their operational processes conform to regulatory requirements while in motion. Once compliance with regulatory requirements is established, an operator may move into continuous improvement of customer service, or safety, while still maintaining regulatory compliance. The Safety Management System is the NextGen of aviation safety, with tools to monitor safety performance.

Viewing from the fence is a process to see the overall picture.
Occasionally, both customers and regulatory authorities have a vested interest in monitoring the operations of a service supplier or certificate holder. This can be accomplished by Enhanced Monitoring. When a customer takes on Enhanced Monitoring of its service provider, any findings are submitted to the operator as customer concerns. Should a customer demand changes in operational processes, the outcome might be that a stable process, by drift, produces a special cause variation and an unexpected event.

When the regulatory body makes findings during Enhanced Monitoring, it demands that the operator make changes. The regulator takes the driver’s seat while Enhanced Monitoring is applied, and an operator does not have the authority to make its own safety decisions. An operator must run all changes through the regulatory authority. This is how it must be, since the duty of the regulatory body is to provide services to the public and to ensure that the interest of public safety is maintained. The regulator has an obligation to accept the risk on behalf of the public. The regulatory body has the authority to revoke certificates and must therefore be directly involved in operational management during the time of its Enhanced Monitoring.

Enhanced Monitoring is a series of mini-audits and sampling of processes. The organization or person conducting Enhanced Monitoring must be onsite to collect real-time data and comprehend safety system processes as they are identified by the operator’s SMS. What Enhanced Monitoring is not: it is not a review of data produced by the operator themselves, since this data is in itself biased (this doesn’t imply that it is wrong or incorrect), and with such a review the regulator has not fulfilled the data sampling or frequent mini-audit requirements needed to accept the risk on behalf of the public. Enhanced Monitoring of an operator’s data only is nothing but a task to review intent.

Enhanced Monitoring puts the pieces in a system where they belong.
True enhanced monitoring is to collect real-time data, turn data into information, gain knowledge from this information and comprehend the operator’s SMS by analyzing the information. In addition, enhanced monitoring takes on human factors, organizational factors, supervision factors and environmental factors. In short, enhanced monitoring is to sit on the fence, take notes and view the game of operational safety, which does include several common cause variations, and to view with the intent to spot any special cause variations that could cause unscheduled events. It is impossible for any deficiencies in the company's Operational Control System to escape scrutiny during an enhanced monitoring period. Effective enhanced monitoring takes at least three months to complete, depending on the size and complexity of the operator. It is crucial to safety that any operator, whether an airline or an airport, conducts ongoing and frequent internal mini-audits of its operations as its own internal enhanced monitoring.

Friday, December 15, 2017

Santa’s Run 2017

Santa is getting ready for another journey of delivering gifts all over the world. Over the many years that Santa has delivered, he has experienced incidents and mishaps, but he has never had a serious accident that would stop him from providing excellence in customer service. Some years ago, Santa implemented the Streamlined Mission Service (SMS) in an attempt to reduce the mishaps and incidents. Most of these mishaps happened on rooftops, when the reindeers came in for landing and hit a chimney or had a rooftop excursion. Over the years the gifts had become more sensitive to these incidents compared to the old days, when gifts were more robust and an incident didn’t do any damage. The SMS was sold to Santa with a promise that when fully implemented he would have fewer incidents than he ever had before.

Since Santa had only managed safety on-the-fly, he had concerns about the new way of SMS, but he reluctantly implemented it and looked forward to many seasons with fewer incidents, which didn’t happen. When he reviewed the previous years, Santa found more incidents per landing than ever before. He was disappointed; SMS had not followed through on the promise of fewer incidents. He was determined to put the SMS on the Elves-Shelf and dismiss it forever. So, instead of preparing for a safer journey this time, he sat down with a good book about the history of flight.

Even with an SMS in place there were unforeseen variables.
Well, Santa had flown his reindeer a long time before any person invented the airplane, except that nobody had seen Santa and his reindeers flying; they had just seen the tracks after he had left the gifts. As he was reading, something caught his eye: the Safety Management System was not new to aviation, but started in 1903 with the first flight. Back then, SMS was all reactive and safety was not improved until after an accident had happened. For the first 100 years or so, SMS in aviation consisted of reactive processes applied to Technical Systems only. With the implementation of SMS, Aviation Safety became proactive and was applied to the Human Factors System. “Interesting,” thought Santa, and he said to himself that this almost made sense. He was determined to learn more about what SMS actually is, and whether it could be a useful tool to improve the safety record of his gift-delivery customer service.

As he continued reading the book, he realized that by stepping back from the operations itself and entering a parallel virtual world, he could review operations from the sideline. He sat down on the fence and just observed his operations during the many previous decades of deliveries.
Hour after hour he sat and observed previous missions, and he discovered that he had been operating with a Streamlined Mission Service (SMS) ever since he started the operations. Over the years improvements were made to the operations for safer transportation and higher quality of customer service. One major improvement was when the lead reindeer, Rudolph, received the red nose. The red nose became a requirement after a near-miss with another flying reindeer, and was meant to help with collision avoidance with other flying creatures. When they saw the red nose they would be able to tell that Santa was coming their way. The only reason for the red light was that this was the only colored light bulb Santa could find in his film and picture development room. But it worked. He discovered that since that time there had been fewer incidents per approach and departure than ever before. Santa had discovered the true SMS, which is that SMS is not a magic wand to prevent incidents, but two separate systems to be applied as safety tools.

After a long day of deliveries, Santa and the reindeers took the flight home.
One system is a parallel system to the operations, with the purpose of observing the quality of operations and collecting data, and the second is a system for continuous improvement of the reindeers’ Hoof Factors (HF) system. By improving the quality of the hooves, the reindeers could focus on the quality of landing surfaces and landing spots rather than worry about damage to their low-resistance hooves upon landing and during critical phases of flight. By applying the HF system, Santa was able to change the reindeers’ safety culture, which had been to force more hoof-weight onto the other reindeers in the group. With the SMS, each reindeer accepted its individual accountability to distribute the landing force evenly across all hooves.

After learning more about SMS, Santa changed his opinion about it and realized that it is not about how many incidents can be avoided, but about how each mission is a mission to continuously improve safety.

With this discovery Santa was prepared for another year of world-wide gift delivery. In addition to improving safety, he discovered that improving safety had actually improved his customer service, with more on-time deliveries and fewer hours in the shop to repair the HF system.

When Santa calculated the return of customer service satisfaction, he applied the formula of the continuous improvement factor and applied this factor in other areas of the operations.

Santa hurried back to Mrs. Santa and the Elves and presented his discoveries. They were all excited and supported Santa in that it’s not the utopia of perfect deliveries that counts, but how every year becomes an operational safety improvement. Also, Santa understood that the SMS sales gimmick of magically having fewer incidents was not intended as wrong information; the elves who sold him on SMS simply did not have any practical SMS experience.

Wednesday, November 29, 2017

When SMS Becomes Personal

Safety Management System is not new to aviation but started in 1903 at the moment of the first flight. Back then SMS was all reactive and safety was not improved until an accident had happened.
When SMS is directed from the bow.
For the first 100 years or so of aviation history, SMS in aviation was reactive, and safety processes were improved after the fact. Over time, aviation industry leaders came to believe that the airplane could not reach its full commercial potential without federal action to improve and maintain safety standards. Landmark legislation implemented in 1926 provided for the issuance and enforcement of air traffic rules, licensing of pilots, certification of aircraft, establishment of airways, and operation and maintenance of aids to air navigation.

Despite this, in 1926 and 1927 there were a total of 24 fatal commercial airline crashes, a further 16 in 1928, and 51 in 1929, which remains the worst year on record at an accident rate of about 1 for every 1,000,000 miles flown. Based on the current numbers flying, this would equate to over 7,000 fatal incidents per year. Aviation was not considered to be a safe mode of transportation.

SMS is to know what options to balance.
In 1956 one of the worst mid-air accidents happened over the Grand Canyon, and the result was more rules created to prevent identical accidents. There was no indication of wrongdoing or of non-compliance with regulations in cancelling IFR and flying 1000-on top. Shortly before 10 a.m., both pilots reported to different communications stations that they would be crossing over the canyon at the same position at 10:31 a.m. The Air Traffic Controller was not required to issue a traffic conflict advisory to either pilot and was, in fact, prohibited from doing so. It was the sole responsibility of the pilots to avoid other aircraft in uncontrolled airspace.

Air safety regulation as we know it today has been shaped by aircraft disasters that have happened in the past. Any given aviation disaster can be attributed to human failure, technical failure, extreme weather, or sabotage. Over time all these factors were as good as eliminated from aircraft accidents. Aviation had become the safest mode of transportation available. In search to further improve safety, the Safety Management System in aviation was implemented as a regulatory requirement to address human factors. Since all other systems had been improved, the time was right to improve the human factors system.

However, when the SMS was seen as the last link to create the utopia of safety in aviation, it became the failure of aviation safety. SMS in itself could not and cannot fail, since it is a parallel system that observes applied processes. It is not a system of operational control, but a system and tool to manage operational control. When applied correctly, SMS is the tool to discover flaws and apply corrections, not a tool to create the utopia of safety.

Several articles have been written and surveys conducted that take a negative view of the Safety Management System. When SMS is looked upon as the one solution to bring the utopia of safety into flying, it will fail in the eyes of the beholder. In addition, if biased and personal opinions are applied, an effective SMS could easily be described as a disaster to safety. This is simply because an effective SMS paints a picture of how safe the operations are and of the confidence level to which an operator can support safety with data. When these articles and surveys describe SMS as being unsafe, or not useful at all, their statements describe the operations itself and not the SMS. SMS is a system that analyzes personal behavior, and it becomes easier to attack the messenger than to accept the facts of personal behavior that SMS has already discovered. When SMS becomes personal, it sets the stage for operational failure and not the failure of the SMS.

Wednesday, November 15, 2017

Possibility or Probability

That there is a possibility does not imply that there is a probability.
Possibilities are variables while probabilities are facts. Probabilities vary due to the effect of possibilities. A possibility is often applied to safety as a fact with an undisputed path for an event to occur. Possibility is the expression of a desire for an event to occur, while probability is an analysis of facts to establish the likelihood level of an event occurring. When possibilities are applied to regulatory compliance, operations gradually become a system failure, or a dysfunctional operation, while with the application of probabilities operations improve their safety, or functional systems, and operate with an effective SMS.

There is a possibility that all marbles remain in the bowl, but a low probability.
Probabilities are levels of the likelihood of one possibility, or the confidence level of a prediction that an event will occur in the future.
There is only one possibility applied, but this possibility is applied to the different criteria of likelihood. Likelihood is defined in many shapes and forms. One method is to define likelihood as ten independent levels based on the time frame between events.

Likelihood levels could be defined as follows:
A) Inconceivable
Times between intervals are imaginary, theoretical, virtual, or fictional.
B) Rarely
Times between intervals are beyond factors applied for calculation of problem-solving in operation.
C) Remotely
Times between intervals are separated by breaks, or spaced greater than normal operations could foresee.
D) Randomly
Times between intervals are without definite aim, direction, rule, or method.
E) Variable
Times between intervals are indefinable.
F) Occasionally
Times between intervals are inconstant.
G) Often
Times between intervals are protracted and infrequent.
H) Frequently
Times between intervals are reliable and dependable.
I) Regularly
Times between intervals are short, constant and dependable.
J) Systematically
Times between intervals are methodical, planned and dependable, without defining the operational system or processes involved.

When examples are applied to these likelihood levels, they come alive and become practical in a Safety Management System.

The likelihood is inconceivable.
As an example, for each departure there is a possibility that an airplane experiences an engine failure just after liftoff. However, the probability that this event occurs is based on data applied to a likelihood level. By applying the possibility to each level of likelihood and picking one level based on data, it can be established what effect the possibility of an engine failure has on operational safety. In this example the possibility is applied to a likelihood level of Randomly.

D) Randomly – an engine failure just after liftoff occurs randomly. These intervals between engine failures are without definite aim, direction, rule, or method.
However, when an engine failure is applied as a possibility only, it becomes a possibility to an inconceivable event and there could not be safety in aircraft operations.
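The ten likelihood levels and the engine-failure example above can be written down as a small lookup structure. This is a minimal sketch for illustration only: the class name `Likelihood` and the function `describe` are hypothetical, the descriptions are copied from the list above, and the level is passed in by hand because the post does not state the data thresholds that would select it.

```python
from enum import Enum


class Likelihood(Enum):
    """The ten likelihood levels A-J; each value is (letter, interval description)."""

    INCONCEIVABLE = ("A", "imaginary, theoretical, virtual, or fictional")
    RARELY = ("B", "beyond factors applied for calculation of problem-solving in operation")
    REMOTELY = ("C", "separated by breaks, or spaced greater than normal operations could foresee")
    RANDOMLY = ("D", "without definite aim, direction, rule, or method")
    VARIABLE = ("E", "indefinable")
    OCCASIONALLY = ("F", "inconstant")
    OFTEN = ("G", "protracted and infrequent")
    FREQUENTLY = ("H", "reliable and dependable")
    REGULARLY = ("I", "short, constant and dependable")
    SYSTEMATICALLY = ("J", "methodical, planned and dependable")

    def __init__(self, letter, intervals):
        self.letter = letter
        self.intervals = intervals


def describe(possibility, level):
    """Tie one possibility to the single likelihood level chosen from data."""
    return (f"{level.letter}) {level.name.capitalize()}: {possibility}; "
            f"times between intervals are {level.intervals}.")


# The level is an input here; in practice it would be picked from operational data.
statement = describe("engine failure just after liftoff", Likelihood.RANDOMLY)
print(statement)
```

Keeping the possibility and the level as separate arguments mirrors the point of the example: there is one possibility, and the data decides which level it is attached to.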

The same scenario holds water when applying one possibility to regulatory compliance. It could be said, and it has been said, that there is a possibility for an operator to be non-compliant with one regulation and that a regulatory non-compliance finding could therefore be issued. When one possibility is applied without a link to likelihood, all operations are non-conforming to regulatory requirements and all operations become non-conforming to safety.


Thursday, November 2, 2017

SMS: An Umbrella Or A Wheel

There are many names associated with the Safety Management System (SMS). A Safety Management System is often described as an additional layer of safety, but this description does not address which other layers of undefined processes it is an addition to. The statement is widely accepted as fact without analyzing the other underlying processes. Several steady improvements in the accident rate over the lifespan of aviation were attributable to improvements in technology, such as the introduction of more reliable engines and navigation systems. Pilot error, or human factors, was assigned as the root cause each time there was an accident. This root-cause statement included a statement that a person had failed to comply with a regulation or standard which had been arbitrarily implemented by the State.
An umbrella is a shield of protection and not a system of safety.
More than once a new regulation or standard would be arbitrarily implemented after a major accident. Assigning the blame to the flight crew was an easy way out, without accountability for the operational processes. Continuous safety improvement in aviation had become difficult under an approach that assigned the root cause of an accident to one person only. The task of continuous safety improvement had become a task of finding a flight crew member who would never be involved in a future accident. Since this is an impossible task, the Safety Management System was developed to make aviation a more perfect operation with an assigned operational safety confidence level.

This new approach to managing organizational factors, human factors, supervision factors and environmental factors was looked upon as an additional layer of safety on top of what the aviation industry was already doing. However, the aviation industry had not been doing anything other than complying with regulatory and standards requirements. When the SMS program was presented as an additional layer of safety, everyone assumed that by complying with this highest layer of the hierarchy, all other regulations and standards would take care of themselves. It had become an assumption that they were self-regulated by the Safety Management System.

While the assumption of being self-regulated is a misconception of the SMS, the operator’s SMS and the aviation authority’s oversight system are both parts of a complete package. The concept of an SMS is that the operator has processes in place for the safe operation of an aircraft or airport, and processes in place to ensure regulatory compliance. When there is an audit by the aviation authority, the ideal outcome is that there are zero findings, or an operational zero tolerance to compromising aviation safety.

An SMS is a formal means for operators to demonstrate their management capability to meet their obligation to operate at the highest level of safety in the public interest. While both oversight systems are highly complementary and interactive, they are separate and essential components of the regulatory safety management strategy. In other words, SMS is a parallel system and a supporting system to the operations.

SMS is to strengthen each spoke of the wheel.
SMS has been described as an umbrella of the operational safety management system. An umbrella is a tool that covers or protects from above. Applied to the SMS, the umbrella is an overarching system encompassing all other systems within the organization. If it were a fact that the SMS is an umbrella, it would take precedence over all systems within an organization. This would cause mass confusion and an inability to manage operations. With the SMS established as an umbrella, anyone can use the safety card to disable organizational safety management.

On the other side, when the SMS is looked at as a parallel system to operations and as a wheel with spokes, it becomes manageable and practical in the application of safety. As a parallel system and a wheel with spokes, the operator may choose to strengthen the wheel by applying more strength to one spoke or another. SMS is not an overarching operational umbrella system. SMS is a system that receives data from operational practices and applies this data to each of the spokes in the wheel, to strengthen the wheel and the operational confidence level of safety management.


Wednesday, October 18, 2017


Accountability is to comply, without supervision, with regulatory requirements, standards, policies, recommendations, job descriptions, expectations or the intent of job performance, and for personnel to be actively and independently involved. Accountability is an element of a just culture, which is an organizational culture where there is Trust, Learning, Accountability and Information Sharing.

Accountability is a process in motion.
Accountability is a process in motion and not a static state of virtual events. Accountability is different from responsibility, since it is the behavior that triggers events in a form that produces the most positive result. When a person gets a driver’s license, they have a responsibility to stay on the correct side of the yellow line that divides oncoming traffic. This personal responsibility does not leave the person even when the person is not driving a vehicle. It is a responsibility of the license itself. The same is true for a pilot or an aircraft mechanic, where the responsibility follows their licenses. On the highway it doesn’t make safety-sense to divide oncoming traffic with a 6-inch yellow line. However, it is accountability that makes this possible. A driver of a vehicle is not accountable to all and everyone on the road, but only to the first approaching vehicle, then to the next vehicle, then to the one after that, and so on. Accountability is action in motion. Everyone expects that the other driver comprehends the responsibility and is accountable to safety. When driving down a two-lane highway there must be Trust involved.

Trust is the first element of a just culture. Without trust there is nothing. A pilot is trusted to become a part of the operations, trusted with a single-engine bush plane or a multi-million-dollar airplane, and trusted to carry one or several hundred passengers onboard who trust the pilot and the flight crew. Without trust there are no flights.

Learning is voluntary.
Learning is the second element of a just culture. Trust has given a person an opportunity to apply their skills and knowledge, but they continue to learn and excel in performance. At times this learning curve levels off, while at other times the learning curve is steep. A steep learning curve may come from new challenges, but also from learning from incidents. Incidents are not a requirement for learning, and every effort is made to ensure that every flight goes right, in the sense that everyday work achieves its objectives.

The third element of a just culture is Accountability. Accountability is applied to trust and learning. Accountability is to be accountable to safety by staying on the correct side of the yellow line painted on a highway. If it had not been learned what the yellow line communicates, a driver could be zigzagging across the line, and if there were no trust, the opposing traffic could not maintain safe separation.

Information Sharing applicable to the process.
The fourth element is Information Sharing. After trust is established, learning is ongoing and accountability has a track record, Information Sharing is implemented. This information sharing, whether internal, with stakeholders, or with customers as advertising or operational safety information, is an operational tool for continuous or continual safety improvements. One fabulous way to improve safety is to share ideas across the board and then implement the best ideas in your own operations. One reason that ideas or demands from a regulatory authority do not work in operations is that a regulatory body implements ideas within the concept of reactive accountability and outside a just culture.

Accountability is the backbone of a successful SMS. It is not to be held accountable in the traditional reactive concept, but to be held accountable in a proactive concept. When there is proactive accountability there is the ability to succeed under varying conditions, so that the number of intended and acceptable outcomes is as high as possible. Accountability is to harness human factors and human resilience.


Saturday, October 7, 2017

Discussing Numbers

Analysis of data in aviation is an application of historical data to predict future events. This data is applied to predict hazards, but is not available as a tool to predict incidents or accidents. Incidents or accidents cannot be predicted since there is no capability within a data analysis to assign time and location of a future incident or accident.

Reporting hazards is SMS painting a picture of the operations.
The concept of predicting any hazard is the same as the concept of predicting the hazard of inflight icing conditions. When an airplane is flying into icing conditions, the hazard of icing is predictable based on the area weather forecast and knowledge, but it cannot be pre-assigned to one specific future incident or accident, since variations are unpredictable.

A high volume of reports for one operator does not necessarily mean that this operator is a higher risk than another operator. Raw report counts are often assumed to indicate a higher risk and are assigned a value without further analysis of the data itself. SMS has become a tool that competitors could use as a competitive advantage in a contract bid, by claiming that they have fewer reports than their competition. Anyone who is unfamiliar with the function of an effective SMS could easily buy into this scam that fewer reports equal lower operational risk. This is the reverse of what SMS is. Without data, there is a zero confidence level in the safety culture, or in how healthy operational safety is. On the other hand, with collected data the confidence level of how safe operations are can be established by a statistical process control analysis. However, a high volume of reports does not automatically ensure safe operations; it provides more data for analysis, so that an SMS can implement processes for continuous safety improvements.

Let’s take a moment and analyze reports from three small airports. These airports are similar in size, operations and movements and are within 200 NM of each other. The first airport reported 7 events over the last 15 years, but did not submit any reports in the last 5 years. The second airport reported 66 events during the same period, with the last reported event this year. The third airport reported 264 events during the same timeframe as the other two. At first glance, it appears that airport #3 is a high-risk airport compared to the other two. This raw data does not tell a story or provide any valuable information for making statements about the operational confidence level. Only by analyzing the data is it possible to paint a picture of operational safety and make statements related to a confidence level of safety in operations.

Analyzing an SMS is more than discussing numbers.
An initial analysis of the operations shows that airport #1 quit reporting 5 years ago. Based on trends, this implies that the airport stopped all reporting, not that the events stopped happening. The reason is not known until a further inquiry into airport operations is conducted.
Airport #2 is continuing to report, and the number of reports has steadily decreased over the last 15 years. Airport #3 shows a steady reporting chart where the number of annual reports is variable but moves above and below the average line in the chart. Airport #3 therefore has a healthier reporting culture than the other two airports.
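The three reporting patterns described above can be told apart with a few simple checks on annual report counts. The sketch below is only an illustration under stated assumptions: the post gives the 15-year totals (7, 66 and 264) but not the yearly figures, so the annual counts here are invented to match those totals, and the function name `reporting_pattern`, the three-year quiet window and the 50% decline threshold are hypothetical choices.

```python
from statistics import mean


def reporting_pattern(annual_counts, quiet_years=3):
    """Roughly classify a series of annual report counts (oldest year first)."""
    # Reporting stopped: earlier activity, then nothing in the most recent years.
    if sum(annual_counts) > 0 and all(c == 0 for c in annual_counts[-quiet_years:]):
        return "stopped reporting"

    # Steady decline: the later half averages well below the earlier half.
    half = len(annual_counts) // 2
    early = mean(annual_counts[:half])
    late = mean(annual_counts[-half:])
    if late < 0.5 * early:
        return "declining"

    # Stable: counts move both above and below the overall average line.
    average = mean(annual_counts)
    above = sum(1 for c in annual_counts if c > average)
    below = sum(1 for c in annual_counts if c < average)
    if above and below:
        return "varying around the average"
    return "undetermined"


# HYPOTHETICAL yearly counts; only the totals (7, 66, 264) come from the text.
airport_1 = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
airport_2 = [9, 8, 8, 7, 6, 6, 5, 4, 4, 3, 2, 2, 1, 0, 1]
airport_3 = [18, 16, 19, 17, 20, 15, 18, 17, 19, 16, 18, 17, 19, 16, 19]

for name, counts in [("#1", airport_1), ("#2", airport_2), ("#3", airport_3)]:
    print(name, sum(counts), reporting_pattern(counts))
```

The classification looks only at the shape of the reporting over time and never at the size of the total, which is the reason a raw count of 264 cannot by itself mark airport #3 as the higher risk.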

When analyzing facts, as opposed to opinions, airport #3 is an airport with accountable processes, so that airport users can rely on information and data from this airport in their decision-making processes. Airport #3 can state with a high level of confidence that it has operational processes in place that paint an accurate picture of its airport operations. The other two airports have no documentation to support their opinions that the SMS pictures painted are accurate pictures of their operations.

Data, information, knowledge and comprehension of operational processes are vital components for continuous safety improvements. Analyzing SMS is more than discussing numbers, where the group or person with the louder voice and better vocabulary wins the argument. When applying strategies and solutions to airport safety, it is not the number of events that becomes the issue, but the comprehension of airport operations.