Wednesday, June 28, 2017

The Fork In The Road Test

All roads lead to Rome, and there are many different ways of reaching the same goal or objective. Finding the root cause is a road trip that maps out the turns and forks in the road. When travelling by air, the course may take a detour by relying on the old ADF, or be more effective by following a GPS course. There are several root cause analysis techniques. They all serve the purpose of improving safety, and one root cause model may be as effective, or as ineffective, as another. All root cause analysis models are designed to establish at what time or location in the failed process a different approach could have produced a different outcome. The 5-Why and fish-bone root cause analyses are widely accepted within the aviation industry and are assumed to establish the correct root cause. A risk assessment of substitute and residual risks is normally conducted after the root cause analysis to identify whether implementing the proposed corrective action plan (CAP), in the form of a new risk control strategy, introduces other or unexpected hazards. As a compliance criterion, the enterprise monitors the CAP with a follow-up to assess the effectiveness of the safety improvement.

Without knowledge one fork in the road is as good as another.
Monitoring the effect of corrective action is a standard procedure for follow-up of CAP implementation. Monitoring and follow-up may be dependent on seasonal differences or on the timeframes needed to collect enough data to establish effectiveness. If an enterprise has lost control of its safety processes and has decided to implement corrective strategies, The Fork In The Road Test is a tool to identify whether steps in the evaluation process are taking shortcuts and jumping to the conclusion that the new strategy is the correct strategy. Such a shortcut is an attempt to break through a wall in a maze to reach the end of the tunnel without following the process path.

The Fork In The Road Test backtracks the process to establish where in the maze the wall failed, and to establish the time and location in the future where the missing link of a CAP is. This does not imply that an incident or accident can be predicted, but it does imply that knowledge is vital to predict the hazards affecting the process.

When building and operating out of an ice-runway, there are many considerations affecting the design of the runway. Since the runway thaws out every year, it must be rebuilt the next season at exactly the same location to be validated as the same runway and to apply the same instrument approach procedures. In addition to the runway itself, the ice movement over time offsets the precision approach to a point where it becomes unusable. When applying The Fork In The Road Test to the runway, the time begins when the ice melted and the location begins where the ice threshold was located.

It becomes simple to see The Fork In The Road if an aircraft landed on the ice in the spring, when most of the ice had melted. The time can be backtracked to establish that a change in direction, or taking a different turn at the fork in the road at that time, would have made a difference. The Fork In The Road is not the time and location when the flight crew had to make a decision to change flight path, but the time and location at which the aircraft could have been expected to complete the flight without an incident. When the ice had melted, the aircraft was doomed at the time of departure.

The Fork In The Road does not always take a straight path.

On the other hand, let’s assume that the accident happened in the middle of the winter with a foot of ice and a temperature of minus 45 degrees. At this point it becomes a scientific task to establish where the fork in the road was. The task becomes to identify whether there were special variations that caused the change of course or an incident, or whether there were normal variations that were overlooked. E.g. melting ice would be a normal variation, blowing snow would be a normal variation, darkness would be a normal variation and ice-ridges would be a normal variation.

Let’s for a minute assume that the airplane hit an ice-ridge. This establishes the Fork In The Road at a time before the aircraft departed from civilization. The Fork In The Road is the one trigger that would, without doubt, have made a difference for safe operations. In this scenario, it would have made a difference if the normal variation had been identified and the runway inspected and assessed as safe for operations. The Fork In The Road Test is not applicable to events after departure, since the departure itself locked the aircraft into a path where at some point in time the flight crew would have to make a reversal decision or an incident would happen. The Fork In The Road Test would have predicted a hazard of normal variations, but since the Fork In The Road Test was not applied, the hazard was not identified.

At the world’s most dangerous airport, a siren sounds about 45 minutes prior to arrival and before the aircraft departs for this destination. The Fork In The Road has been identified and the corrective action implemented at the time and location where it effectively makes a difference, and aviation has become safer. The Fork In The Road is in the planning and the decision to complete a pleasant flight.

CatalinaNJB

Saturday, June 17, 2017

SMS Communication

A small operator communicates with its personnel differently than a mega-enterprise would. An airport with 2 or 3 people, being an Airport Manager, an SMS Manager and an Accountable Executive, may communicate verbally without much documentation, while a larger airport may use multiple levels of communication processes. Both operators must meet the same expectation: that communication processes (written, meetings, electronic, etc.) are commensurate with the size and complexity of the organization. When applying this expectation without ambiguity, or applying it with fairness to the expectation itself, both operators are expected to apply the exact same communication processes. The simplest avenue when assessing for regulatory compliance is to apply the more complex communication processes to both operators. With this approach, the smaller airport’s SMS becomes a bureaucratic, unprofessional and ineffective tool for safe operations.

Small airport communication has also changed with the times.
At first glance it looks great that a small airport is expected to operate with different communication processes than a large airport, but when analyzing all available options, there is very little tolerance, or none, for applying this expectation in any other way than as a prescriptive regulatory requirement. It is only when the operation is understood that the expectation can be applied with ambiguity, and with unfairness to the expectation itself, to produce an effective communication process for airport operators of any size and complexity.

When there is a finding at a small or large airport that the communication did not meet the regulatory requirement through this expectation, in that the information had been forgotten, misplaced or incorrectly interpreted, the operator is required to identify the policies, processes, procedures and practices involved that allowed this non-compliance to occur. This in itself, that an operator allowed a non-conformance to occur, is a statement of bias in the finding, implying that the operator had an option not to let this non-compliance occur. If this option had been available at that time, the operator would have taken different steps. The reason the non-compliance occurred is that the option to make a change was not available at the time when it occurred. Not all systems within the SMS were functioning properly, and often it is the systems of human factors, organizational factors, supervision factors or environmental factors. Reviewing a finding in 20/20 hindsight is a simple task, and pointing out what could have been done differently becomes the task of applying the most complex process. However, when the operation is in the moment, the options at that location and point in time are limited to snapshots only of information, knowledge and comprehension of the events.

Human factors in communication are today integrated in automation and are not visible.
Since there is an assumption in the root cause analysis requirements that the non-compliance was allowed, the question to the finding is no longer what happened, but why the operator allowed this event to escalate beyond regulatory requirements. The difference between a “what” and a “why” question is that “what” is data and “why” is someone’s opinion. When the finding is issued during an audit by the regulatory oversight team, the answer to the “why” question becomes the opinion of only one person on that team. When the opinion of that person becomes the determining factor of a root cause analysis, it has become impossible to analyze the event for a factual root cause. It could also be that the answer to the “why” question becomes a compromise, or an average of the views of all inspectors, in which case the answer is no longer relevant to the finding, but to a process where all opinions are considered. On the other hand, when the “what” question becomes the determining factor, each link to what happened must be supported by data and documented events. If the events of the “what” cannot be answered first, before any other questions are answered, it has become an impossible task to assign a root cause to the finding. When the “what” question is answered, then a change to the “what” may be assigned and implemented.

This does not imply that the “why” question is not to be asked, but it becomes a matter of how the “why” is asked, and of whether the determining factor for the “why” question is an agreement between several people in a group to assign an average of indifferences, or whether the “why” question is answered in relation to the “what” question. When applying the 5-why process, the answer, or root cause, is established by the answer to the first “why”, since the rest of the answers must be locked in to the first. The more effective root cause analyses are the “fish-bone”, the “5-why matrix”, or the “fork-in-the-road” test.

When the second expectation, stated in the root cause analysis document, that the non-compliance was allowed, is applied, it changes the first expectation within an SMS element, of different communication based on the size and complexity of the airport, into a prescriptive regulatory requirement. The prescriptive requirement then becomes the common denominator for the event that was allowed to occur and must be applied to the most complex communication process. The simplest way to look at this is that when the “allowed to” is allowed to be applied to an event, it is assumed that human variations do not exist and that the system is operating in an undisputed perfect virtual environment.

CatalinaNJB

Friday, June 2, 2017

No Data, No History, No Event

Root cause analysis is to find the single cause of why an unplanned event happened, or a link in the process where a different decision would have produced a different outcome. This does not necessarily imply that a different decision would have avoided the unplanned event, but the event may have happened at a different time or location and with a different outcome. The expectation of a different outcome is that the unplanned event would not happen.

When analyzing for the root cause, the 5-Why process is often applied. Unless an unbiased process is applied to the answers of the 5-Why process, the desired answers could be established prior to initiating the process, with the answers backtracked from this desired answer. The fact is that most 5-Why processes only allow for one option for the root cause. Since the organization is determined to establish a root cause, the root cause could be established without applying the 5-Why process. This is the “checkbox” syndrome of establishing a root cause by applying the approved root cause analysis. The assumption is that as long as the paperwork looks good it must be the right root cause and operations must be safe, correct? No, this is not correct. An incorrect root cause is more unsafe than a known but non-effective root cause, since the new and incorrect root cause has not been tested and the outcome is unknown. Assuming that the new root cause is effective is to assume that opinions are facts.

Find the roots that feed life into the process.
A root cause analysis must include data from prior documented events. If there is no data, no history and no documented event, a root cause analysis cannot be based on past experiences. A one-time event is not a trend, and applying a root cause analysis to one event defeats the purpose for the safe operation of an airport or aircraft. If there is no data, there is no trend and there are no prior events to compare the analysis to. The key to success is to establish data and trends to determine the root cause, and to make changes to the processes to reduce or eliminate another unscheduled event or failure. E.g. should a runway edge light fail and there is no data of prior failures, the short-term fix is to replace the lightbulb. This might not be what the regulator wants to see, but the fact is there is no data to justify a root cause and, in addition, there is no data to justify that the burnt-out light is not an acceptable risk. Over time an airport may track the burnt-out lights (which is data) and over a period of 3-7 years establish a pattern of malfunctioning lights. With this information the airport may establish the root cause and change the lights at a reasonable time before the bulbs are expected to burn out. It’s as simple as that.
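
As a rough illustration of turning tracked burnt-out lights into a pattern, the minimal sketch below averages the days between logged failures of one light and suggests a replacement interval somewhat shorter than that average. The function names, the 0.8 margin and the dates in the log are hypothetical and are not taken from any real airport's records.

```python
from datetime import date
from statistics import mean


def mean_days_between_failures(failure_dates):
    """Average number of days between consecutive logged failures of one light."""
    ordered = sorted(failure_dates)
    if len(ordered) < 2:
        # A one-time event is not a trend: there is nothing to average.
        return None
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps)


def replacement_interval_days(failure_dates, margin=0.8):
    """Suggest replacing the bulb after a fraction (margin) of the average gap.

    The margin is an assumed safety factor, not a regulatory value.
    """
    average_gap = mean_days_between_failures(failure_dates)
    if average_gap is None:
        return None
    return int(average_gap * margin)


# Hypothetical failure log for a single runway edge light.
log = [date(2015, 1, 10), date(2015, 7, 9), date(2016, 1, 5)]
print(mean_days_between_failures(log))
print(replacement_interval_days(log))
```

With a single logged failure both functions return `None`, which mirrors the point above: without prior events there is no trend to base a replacement schedule on.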

Another option is to apply best-practices or continuous safety improvement by collecting data from the light manufacturer on how many hours or cycles a runway edge light is expected to last. If this was done, the time needed to establish a process for changing these lights before they burn out could be reduced from 7 years to 6 months, and the airport's safety goal of minimizing burnt-out lights could be achieved in a short time. By applying the data supplied by the manufacturer, a 5-Why analysis may not even be necessary to establish the root cause.

When there is only one option of questions for a root cause, that question must be answered first.
Let’s assume that an airport took the best-practices route and established a lifetime for runway edge lights. However, the lights still burned out earlier than expected and became a frustration to airport management and an inconvenience for their customers. The next step is to collect data for a root cause analysis. In the process of deciding what approach to take to collect data, the 5-Why Matrix was applied. The result from this matrix was to mount wildlife cameras at the airport to see if there was any wildlife connection to the burnt-out lights. Over time it was discovered that coyotes came and chewed the power cable and that the lights therefore burned out about 2 days later. This data could now be applied to a root cause analysis, or the location of the fork in the road, and the process of transferring power could be improved. In other words, by burying the cable underground and covering it up to the light, the long-term corrective action extended the intervals between light replacements.

Without data, there is no event, only opinions of events. Applying a straight 5-Why does not necessarily establish the correct root cause, since the answer is locked in after the first question is answered. For the 5-Why process to be more effective, the application of a matrix moves the process out-of-the-box for an unbiased result.


CatalinaNJB