Category: Air safety

Air safety: When statistics are used to kill the messenger

A long time ago, I observed that big, long-range planes -with a few exceptions- always had a better safety record than smaller planes. Many explanations were offered for this fact: the biggest planes get the most experienced crews, big planes are more carefully crafted… The answer was simpler than that: the most dangerous phases of a flight happen on the ground or close to it. Once the plane is at cruise level, the risk is far lower. Of course, the biggest planes fly the long routes or, in other terms, for every 10 flown hours a big plane performs, on average, one landing, while a little one could land 10 times. In a statistical report based on flown hours… which of them is going to appear safer? Of course, the big one. If statistics are not read carefully, someone could start to worry about the high accident rate of little planes compared with big ones.
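
To make the arithmetic visible, here is a minimal sketch in Python with invented numbers (they are not real accident statistics): two fleets with exactly the same risk per flight look very different once the same accidents are normalized by flown hours.

```python
# Invented, illustrative numbers only, not real accident statistics.
# Both fleets carry the same risk per flight, since most accidents
# happen at take-off or landing, not at cruise.
risk_per_flight = 1 / 500_000

fleets = {
    "long-haul (10 h per flight)": 10.0,  # average hours per flight
    "short-haul (1 h per flight)": 1.0,
}

for name, hours_per_flight in fleets.items():
    risk_per_hour = risk_per_flight / hours_per_flight
    print(f"{name}: {risk_per_flight:.1e} per flight, "
          f"{risk_per_hour:.1e} per flown hour")

# Same per-flight risk, yet the long-haul fleet looks ten times
# "safer" when the report is based on flown hours.
```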

Now, the American NTSB has discovered that helicopters are dangerous: http://www.flightglobal.com/news/articles/ntsb-adds-helicopters-ga-weather-to-quotmost-wantedquot-394947/ and the explanation could be similar, especially if we address HEMS activity: emergency medical services are given an extremely short response time. That means flying machines whose engines can still be cold at the moment of a demanding take-off performed, for instance, very near a hospital or populated places where an almost vertical climb is needed. Once airborne, they have to prepare a landing near the accident site. The site can have buildings, unmarked electrical wires and, of course, it can be anything from fully flat terrain at sea level to a spot high in the mountains. Is the helicopter risky, or is the risk in the operation?

Of course, precisely because the operation is risky, everything has to be done as carefully as possible, but making statistical comparisons with other operations is not the right approach. Analyze in which phase of the flight the accidents happen; if the pilot does not have full freedom to choose the landing place then, at least, choose an adequate place for the base. Some accidents have happened while the doctor on board could see they were very near an electrical wire and assumed that the pilot had seen it too… all eyes are welcome, even the non-specialized ones. Other times, non-specialized people asked and pressed for landings in crazy places, or rosters and missions were prepared ignoring experience and fatigue issues. That is, there is a lot of work to do in this field but, please, do not justify it with statistical reports comparing things that are really hard to compare.

Does CRM work? Some questions about it

Let's start by clarifying something: CRM is not the same as the Human Factors concern. It is a very specific way to channel that concern in a very specific context, the cockpit… even though the CRM philosophy has also been applied to maintenance, through MRM, and to other fields where real teamwork is required.

Should we improve CRM training, or is training not the right way at all? Should we improve the quality of the indicators, or should we be more worried about the environment in which those indicators appear?

An anecdote: a psychologist working in a kind of jail for teenagers had observed something over the years. The center had a sequence known as «the process», whose resemblance to Kafka's work seemed to be more than accidental, and inmates were evaluated according to the visible behavioral markers included in «the process». Once all the markers on the list had appeared, the inmate was set free. The psychologist observed that the smartest inmates, not the best ones, were the ones able to pass the process, because in a very short time they could exhibit the desired behavior. Of course, once out of the center, they behaved as they liked and, if they were caught again, they would once more exhibit the required behavior to get out.

Some CRM approaches are very near this model. The evaluator looks for behavioral markers whose optimum values are kindly supplied by the people being evaluated and, once the evaluation is passed, they can behave according to their real drive, whether or not it coincides with the CRM model.

Many behaviorist psychologists say that the key is which behavioral markers are selected. They can even argue that this model works in clinical psychology. They are right but, perhaps, they are not fully right and, furthermore, they are wrong in the most relevant part:

We cannot simply import the model from clinical psychology because there is a fundamental flaw: in clinical psychology, the patient comes on his own asking for a solution, because his own behavior feels like a problem. If, through the treatment, the psychologist is able to suppress the undesired behavior, the patient himself will be in charge of making the new situation last. The patient wants to change.

If, instead of speaking about clinical psychology, we focus on undesired behaviors from a teamwork perspective, things do not work that way: behaviors unwanted by the organization or the team can be highly appreciated by the one who exhibits them. Hence, they can disappear while they are being observed but, if so, that does not mean learning; it may mean nothing more than craftiness on the part of the observed person.

For a real change, three variables have to change at the same time: competence, coordination and commitment. Training is useful if the problem to be solved is about competence. It does not work if the organization does not make a serious effort to avoid contradictory messages and, of course, it is useless if there is no commitment from individuals, that is, if the intention to change is not clear or simply does not exist.

Very often, instead of a real change, solutions appear in the shape of shortcuts. These shortcuts try to dodge the fact that the three variables are required and, furthermore, required at the same time. Instead, it is easier to go after the symptom, that is, the behavioral marker.

Once a visible marker is available, the problem is redefined: it is not about attitude anymore; it is about improving the marker. Of course, this is not new, and everyone knows that the symptomatic solution does not work. Tavistock consultants used to speak about «snake oil» as the example of a useless fluid offered by someone who knows it does not work to somebody else who knows the same. However, even knowing it, people can still buy the snake oil because it serves their own interest… for instance, not being accused of inaction about the problem.

The symptomatic solution goes on even in the face of full evidence against it. At the end of the day, whoever sells it makes a profit and whoever buys it saves face. The next step is usually to allege that the solution does not perform at the expected level and, hence, that we should improve it.

Once there, crossed interests make changing things hard for anyone who has something to lose. It is risky to say that «the Emperor is naked». Instead, there is a high probability that people will start to praise the Emperor's new gown.

Summarizing: training is useful for change if there is a prior desire to change, and behavioral markers are useful if they can be observed under conditions where the observed person does not know he is being observed. Does CRM meet these conditions? There is an alternative: showing, in a clear and undisputed way, that the suggested behavior gets better results than the one exhibited by the person to be trained. Again… does CRM meet this condition?

Certainly, we could find behavioral markers that, for lovers of depth psychology, are predictive. However, this is a very dangerous road that some people have followed in selection processes, and it could easily become a kind of witch-hunt. As an anecdote, a recruiter was very proud of his magical question for getting to know a candidate: asking for the name of the second wife of Fernando the Catholic. For him, this question provided a lot of keys to the normal behavior of the candidate. Surprisingly, those keys disappeared if the candidate happened to know the right answer.

If behavioral markers have questionable value, and looking for other behaviors only remotely related to the required ones is no better, then we should look in different places if we want real CRM instead of pressure toward agreement -misunderstood teamwork- or theatrical exercises aimed at offering the observer the desired behavior.

There is a lot of work to do but, perhaps, along different paths from the ones already trodden:

  1. Investment in recruiting: Recruiting cannot be driven only by technical ability, since technical ability can be acquired by anyone with the basic competences. Southwest Airlines is said to have rejected a pilot candidate because he addressed a receptionist rudely. Is that a mistake?
  2. Clear messages from Management: Teamwork does not appear with messages like «We'll have to get along» but from shared goals and respect among team members, avoiding watertight compartments. Are we rewarding the «cowboy», the «hero», or the professional with the guts to make a hard decision using all the capabilities of the team under his command?
  3. CRM evaluation by practitioners: Anyone can have a bad day but if, on a continuous basis, someone is poorly evaluated by those in his own team, something is wrong, whatever the observer in the training process may say. If someone thinks this goes against CRM, think twice. Forget CRM for a moment: do pilots behave in the same way in a simulator exercise under observation and in a real plane?
  4. Building a teamwork environment: If someone feels that his behavior is problematic, a giant step toward change has been taken. If, on the other hand, he sees himself as «the boss» and is delighted to have met himself, there is no way to a real change.

No shortcuts. CRM is a key to air safety improvement, but it requires much more than behavioral markers and exercises where observers and observed alike seem more concerned about looking polite than about solving problems using the full potential of a team.

When Profits and Safety are in different places: A historical approach to Aviation

All of us have heard that aviation is the safest way to travel. That is basically true but, if 94% of accidents happen on the ground or near it, we should accept that some phases of flight have a risk level worth studying.

It's true that some activities carry an intrinsic risk and that safety means balancing an acceptable risk level vs. efficiency. Aviation shares that common situation, but it has its own problems: the lack of external assessment kept safety-related decisions inside a little group of manufacturers, regulators and operators. Consumers listen to the mantra "Aviation is the safest way to travel" but they cannot know whether some of those decisions could drive aviation out of that privileged position.

A little summary of the technological evolution at the big manufacturers can show how and why some decisions were made and how, in the best possible scenario, these decisions meant losing an opportunity to improve the safety level. In the worst one, they meant a net decrease in safety:

Once jets appeared, safety increased as a consequence of higher engine reliability. At the same time, navigation improvements appeared too: ground-based stations (VOR-DME), inertial systems and, later, GPS.

However, at the same time as these and other improvements -zero-visibility landings, for instance- some other changes appeared whose contribution could be considered negative.

One of the best known cases is the number of engines, especially on long-haul flights. Decades ago, the standard practice for transoceanic flights was to use four-engine planes; the only exceptions were the DC-10 and the Lockheed TriStar, with three. However, in places like the U.S.A., long flights where planes could land before their planned destination, if required, were performed by big planes with only two engines.

Boeing, one of the main manufacturers, used this fact to argue that engine reliability could allow transoceanic flights with twins. Of course, maintaining two engines is cheaper than maintaining four, so operators had a strong incentive to embrace the Boeing position but… can we say that crossing an ocean with two engines is as safe as doing it with four, all the remaining parameters kept constant?

Intuition says it is not, but messages trying to oppose this simple fact started to appear. Among them, we can hear that a modern twin is safer than an old four-engine plane. Nobody noticed that, if so, the parameter setting the safety level would be the age of the plane. Then, the right option should be… a modern plane with four engines.

Airbus, the other big manufacturer, complained because at that moment it did not have its own twins for transoceanic flights but, some time later, it accepted the option and launched its own twins for those long-haul routes. This path -complaint followed by acceptance and imitation- has been repeated over different issues: one of the manufacturers proposes an efficiency improvement, "its" regulator accepts the change while asking for some improvements, and the other manufacturer keeps complaining until the moment it has a plane able to compete in that scenario.

In the specific case of twins, regulators imposed a rule requiring operators to keep within a certain distance of airports along their way. That made twins fly longer routes and, of course, that meant time and fuel expenses. However, since statistical information showed that engine reliability is very high, the time span allowed to fly with only one engine working, while loaded with passengers, kept increasing up to the present situation. Now we have planes certified to fly with only one engine working until arriving at the nearest airport… assuming it could be five and a half hours away. Is that safe?

We don't really know how safe it is. Of course, it is efficient, because a twin certified that way can fly virtually any imaginable route. Statistics say it is safe, but the big bulk of reliability data does not come from laboratories; it comes from flying planes, and that is where statistics can fail: engine reliability means that the big amount of data comes from uneventful flights where both engines were working. We can add that twins have more spare power than four-engine planes, due to the requirement that, if an engine fails beyond a given point during take-off, the plane has to be able to complete the take-off with only one engine. Of course, the four-engine plane has to be able to do the same with three engines, not with one.

In other words, during cruise, the engines of a twin work under low stress and that, of course, can have a favorable impact on reliability. The question that statistical reports cannot address, for lack of the right sample, is this: once one engine has failed, the remaining one starts to work in a much more demanding regime. Does it keep the same reliability level it had while both engines were working? Is that reliability enough to guarantee the flight under these conditions for more than five hours? Actually, the lack of a definitive answer to this question made the regulators ask for a condition instead: the remaining engine should not go out of normal parameters while providing all the power required to keep the plane airborne.
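
A minimal sketch of why the sample matters, with invented failure rates (real in-flight shutdown figures are different): if working at full power multiplies the per-hour failure rate of the surviving engine by some factor k, the risk of a 5.5-hour diversion grows almost linearly with k, and k is precisely the parameter that millions of uneventful two-engine flight hours never measure.

```python
import math

# Invented, illustrative rates, not real in-flight shutdown data.
lam_cruise = 1e-5   # per-hour failure rate measured at cruise load
diversion_h = 5.5   # maximum single-engine diversion time

# k: how much the failure rate grows when the surviving engine has
# to provide all the power. Fleet statistics, built almost entirely
# from flights with both engines running, say little about k.
for k in (1, 3, 10):
    lam_single = k * lam_cruise
    p_second = 1 - math.exp(-lam_single * diversion_h)
    print(f"k={k:2d}: P(second failure within "
          f"{diversion_h} h) = {p_second:.1e}")
```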

At least, we could have some doubts about it but, since the decision was made among "insiders" without any kind of external check, nobody questioned it and, nowadays, the most common way to board a transoceanic flight is on a twin. We will attend the masks-and-lifejackets show, but it is unlikely that someone will say:

“By the way, the engines in this plane are so reliable that, in the very unlikely event that one of them fails, we can fly in full safety with the remaining one until reaching the nearest airport, no more than five and a half hours away.”

How many users are informed about this little detail when they board a plane with the intention of crossing an ocean? And this is only one example, because it is not the only field where improvement, followed by complaints and acceptance, was the common pattern.

The number of engines is an especially visible issue -for obvious reasons- but a similar case can be observed in matters like the reduction of cockpit crews or automation. Right now, there is not a single passenger plane from any of the big manufacturers carrying a flight engineer. In this case, Airbus was the innovator with its A310 model and, as with the engines issue, we could ask whether removing the flight engineer has made aviation more or less safe.

Boeing was the one complaining in this case but… it happened to be designing its 757 and 767 models which, in their final configuration, would be launched without a flight engineer.

Is a flight engineer important for safety? Our starting point should be a very easy one: the job of a pilot does not know the concept of "average workload". It goes from urgency and stress to boredom and back. In an uneventful flight over an ocean and without traffic problems, there are not many things to do. The plane can fly without a flight engineer and even without pilots. They remain in their place "just in case", that is, in a situation quite similar -with some differences- to the one we can find in a fire station. However, when things become complex, there is a natural division of tasks: one of the pilots flies the plane while the other one takes care of navigation and communications and, if there is a serious technical problem, they have to try to fix it too… it seems that someone is missing.

This absence was very clear in 1998 with Swissair 111, where a cabin-smoke situation made an MD-11, without a flight engineer, crash. In a few moments, the crew passed from an uneventful flight prepared to cross the Atlantic Ocean to a burning hell where they had to land at an unfamiliar airport, find the place, the runway orientation and the radio frequencies… while keeping the plane under control, dumping fuel and trying to find the origin of the fire to extinguish it.

The accident investigation, performed by "insiders", did not address this issue. The two-person cockpit was already taken as a given, even though another almost identical plane -the DC-10- flew with a flight engineer and could have invited the comparison. Of course, nobody can say that a flight engineer would have saved the plane, but the workload the pilots confronted would have been far lower.

Neither was this issue addressed when an Air Transat plane landed in the Azores with both engines stopped. That happened because the plane was losing fuel, and wrong fuel management made the pilots transfer fuel to the very tank that was leaking. Would it have happened if someone had been devoted to carefully analyzing the fuel flow and how the whole process was working? Perhaps not, but this scenario was simply ignored.

Flight engineers disappeared because automation appeared, and that started a new problem: pilots began to lose their manual flying skills, driving to a situation named the "automation paradox":

Automation provides an easier user interface, but this is a mirage: a cockpit with fewer controls and visually cleaner does not mean a simpler plane. Actually, it is a much more complex plane. For instance, every Boeing 747 generation has decreased the number of controls in the cockpit. Even so, the newer planes are more complex, and that is how the automation paradox works:

Training is centered on the interface design instead of the internal design. That is why we find planes more and more complex and users who know less and less about them. A simple comparison can be made with Windows systems, almost universal in personal computing. Of course, Windows allows many more things than the old DOS but… DOS never got blocked. Unlike DOS, Windows is much more powerful but, if it blocks, the user has no options available.

The question should be whether we can accept a Windows-like system in an environment where risk is an intrinsic part of the activity. The system allows more things and can be properly managed without being an expert but, if it fails, there are no options for the average user.

The "fly-by-wire" system was introduced into commercial aviation by Airbus -the Concorde excepted- and it confronted complaints from Boeing. We have to say that Boeing had long experience with fly-by-wire systems in its military aircraft. Again, we find a situation where efficiency wins, even though some pilots complain about facts like losing kinesthetic feedback. In a traditional plane, a hand on the controls can be enough to know how the plane is flying and whether there is a problem with speed, center of gravity and others. In fly-by-wire planes, by default, this feeling does not exist (Boeing kept it in its planes but, to do so, it had to "craft" the feeling, since the controls by themselves do not provide it).

This absence could partially explain some major accidents labeled "Human Error" or "Lack of Training" without anybody analyzing which features of the design could drive to the error like, for instance, a defective sensor triggering an automatic response without the pilots knowing what is going on.

What is the situation right now? If we check the latest planes from the big manufacturers -Boeing 787 vs. Airbus A350- we can get some clues: both are big, long-haul twins, there is no flight engineer, they are highly automated and both have fly-by-wire. Coincidence? Not at all. Through a dynamic of unquestioned changes agreed among insiders, without the consumers' knowledge, the winner will always be the most efficient solution. So both manufacturers ended up with two models that share a good part of their philosophy. There are differences -electric vs. hydraulic controls, feel vs. no feel in the controls, more or less use of composite materials, lithium vs. traditional batteries…- but the main parameters are the same.

Issues that were discussed some time ago are now seen as already settled. The decision always favored the most efficient option, not the safest one. Could that be changed? Of course, but not while everything keeps working as an "insiders' game" instead of giving clear and transparent information to the outside.

We should also understand the position of the «insiders»: a case like Germanwings was enough for some people -like the NYT- to question the plane before knowing what had really happened. A few days ago, we had an accident with a big military plane manufactured by Airbus, and some people immediately started to question the safety of a single manufacturer… perhaps someone close to the other one?

Information has to flow freely but, at the same time, many people make a living from scandal, and it is hard to find the right point: truth and nothing but the truth and, at the same time, deactivating those who want to find or manufacture a scandal. Nowadays, the environment is very closed and, in that environment, efficiency will always have the upper hand… even in cases where it shouldn't. On the other side, we have to be careful enough to address real problems instead of invented ones. The examples used here could be illustrated not only with the referenced cases but with others whose mention has been avoided.

Air Safety and low-cost

Low-cost started as an almost marginal business but, nowadays, some low-cost airlines have outgrown their traditional competitors. Of course, something like that does not happen by chance. A high growth rate sustained over many years usually points to a serious business project. In this context, «serious» means the opposite of the «take the money and run» model so common in many activities, including aviation.

So we should start with this fact: there are low-cost operators that did not come looking for easy money. This fact, evident in the behavior of some operators, calls for an analysis in which respect is deserved, and it is not going to be denied here.

Once it is clear that we are not speaking about people looking for easy money, the business model linked to low-cost shows itself as very interesting, not only because of its results but because of its eventual hidden weaknesses. We will center our analysis on safety and on the potential impact of low-cost practices over safety:

First, a little bit of common sense: if an operator wants to offer better prices, costs are the enemy to beat, and safety can be translated into costs. Furthermore, since yield management appeared, it is hard to find two passengers in a plane who have paid the same for their tickets and, whatever the traditional operators tell us, it is a way to sell below cost to beat low-cost operators with not-so-deep pockets. In this environment, differences in prices have to be really important to resist that kind of competition.

Common sense also tells us that cutting costs in safety can be a hard-to-resist temptation. However, this conclusion requires a deeper analysis:

Low-cost operators are very conscious that this is an easy-to-reach conclusion and that, if it were true, it could damage them very seriously.

When someone notoriously better known than popular, like Michael O'Leary, Ryanair's CEO, was asked about the risks in the Ryanair business model, he was quite explicit: the risk of doing something stupid on our side, or an accident at an important low-cost operator.

The investigation after an accident can uncover inadequate practices in any airline. However, a low-cost operator runs a different level of risk: an inadequate practice, if discovered, will not be read in terms of negligence or error but as a usual practice to decrease costs and, hence, as a part of the business model.

Hence, low-cost operators are fully conscious that a single major accident can put their business continuity at risk to a bigger degree than it would for a traditional operator. They have tried to minimize this risk in different ways and with different levels of success:

Public relations people from low-cost operators tell anyone willing to listen that they are controlled under the same rules as everyone else. Of course, this statement tries to put in the mind of the listener the idea that they have the same safety level as anyone else. The statement is true but, without even entering into the real capacity of rulemakers and inspectors, it can be deactivated with a simple example: the rules for carmakers are the same. Does that mean a Dacia Logan offers the same safety level as an Audi A8, since both share the same rules?

When there is a serious business project and, of course, the big low-cost operators have one, safety cannot be reduced to crafting ingenious slogans; it has to go much further:

Southwest Airlines, still the most copied model among low-cost operators, based its cost reduction on a very specific operating objective: 25 minutes from landing to take-off. This objective must be hard to reach, since other operators, like JetBlue, decided to drop it and look for cost reduction elsewhere.

Southwest based this objective on a very deep knowledge, by every single worker, of how his activity affected the others. Without trying the everyone-does-everything model started by People Express -hard to keep in the long term- Southwest kept specialization but, at the same time, created an environment based on the ability of the workers performing the job to detect improvement opportunities.

Ryanair's trajectory has been much tougher: time between flights is only one of the ways to reduce costs. There are many others, like who pays for the workers' uniforms, the invitation to grab ballpoint pens from hotel rooms, or the price paid by new recruits for the privilege of working there… Probably, O'Leary himself would not be offended if defined as a CFO who became CEO, since he is fully conscious of that.

O'Leary is so conscious of his importance at Ryanair as financial watchdog that he decided not to attend the meetings where maintenance decisions are made. The decision, together with having someone with a high technical profile as Head of Maintenance, is positive, but it is quite reasonable to ask ourselves whether that is enough. It is extremely hard to create watertight compartments in any organization and, in this case, they seem to be trying to create exactly that: a watertight compartment keeping Maintenance out of the pressure for cost reduction applied everywhere else in the organization.

However, it is easy to forget that safety is not a function but a perspective covering all the operations in the organization. If an organization is known for a very specific perspective -cost reduction- asking what will happen when both perspectives clash is a must.

As an example, it is possible to decide to keep a good spare-parts stock but, in a cost-reduction-driven organization… what happens if a maintenance job is delayed beyond expectations? What happens if a pilot puts in more fuel than the strictly legal requirements? What if a pilot, already delayed, refuses to speed up tasks or to run checklists faster than usual? We could find hundreds of examples of clashing perspectives and, of course, having Maintenance isolated from cost-reduction pressure is not enough. If lowering costs is the dominant perspective, it will hang over every decision anyone makes, in a cockpit or in any other position. That, of course, will affect the real safety level that can be reached.

Low-cost operators have been in the market long enough to show differences among themselves. Possibly, a soft model like Southwest's, centered on cost reduction in very specific ways, will be less sensitive to safety issues than more hard-nosed operators. The latter will pursue costs wherever they are in order to exterminate them, and this attitude can drive them, very often, into a conflict of perspectives.

For the good of all stakeholders, including passengers, it would be desirable for the companies most aggressive in their cost-reduction practices to be able to solve the organizational problem that two conflicting perspectives bring, especially when one of them always holds the winning hand.

The challenge will not be easy, and it is going to require from these very imaginative and energetic operators at least as much imagination and energy as they devoted to cost reduction; perhaps it will make them change some habits that they see as very important, since those habits were an important part of their success.

We will see whether they succeed in this effort or whether they prove Drucker's statement that success makes obsolete the factors that made it possible. If so, the factor that could become obsolete, even though it drove the past success, is precisely the fundamentalism in cost reduction. Fundamentalism, in this context, should be understood in its most literal meaning: invasion of fields that are not its own.

Lessons from 9/11 about Technology

 

A long time ago, machines became stronger and more precise than people. That is not new but… are they smarter too? We can forget developments close to sci-fi, like artificial intelligence based on quantum computing or on the interaction among simple agents. Instead, we are going to deal with present technology, its role in an event like 9/11 and the conclusions we can draw from it.

 

Let's start with a piece of information: a first-generation B747 required three or four people in a cockpit with more than 900 elements. A last-generation B747 requires only two pilots, and the number of elements inside the cockpit has decreased by two thirds. Of course, this has been possible through the introduction of I.T. and, as a by-product, through the automation of tasks that previously had to be performed manually. The new plane appears easier than the old one. However, the number of tasks the plane now performs on its own makes it a much more complex machine.

 

The planes used on 9/11 could be considered state-of-the-art at that time, and this technological level made the event possible -together, of course, with a number of things far from technology. Something like 9/11 would have been hard with a less advanced plane: handling old planes is harder, and the collaboration of the pilots in a mass murder would have been required. Not an easy task, getting someone to collaborate in his own death under a death threat.

 

The solution was making the pilot expendable and that, if the plane is flying, requires another pilot willing to take his own life. What is the training cost for that pilot? In money terms, a figure of $120,000 could be more or less right for training a professional pilot; this would not be hard to get for the people who organized and financed 9/11. A harder barrier to pass is the time required for this training. Old planes were very complicated, and handling them required a good amount of training acquired over several years. Would the terrorists be so patient? Could they trust the commitment of the future self-killers over the years?

 

Both questions could invite the organizers to reject the plan as unfeasible. However, technology played its role in a very easy way: under normal conditions, modern planes are easier to handle and, hence, they can be flown by less knowledgeable and less expert people. From this point, the situation appears in a different light: how long does it take a rookie pilot to acquire the dexterity needed to handle the plane at the level required by the objectives? The facts gave the answer: a technologically advanced passenger plane is easy to handle -at the level required- by a low-experience pilot after an adaptation through simulator training.

 

Let's go back to the starting question: machines are stronger and more precise than people; are they smarter too? We could start discussing the different definitions of intelligence but, anyway, there is something machines can do: once a way to solve a problem has been defined, it can be programmed into a machine so that the problem is solved automatically, once and again. As a consequence, there is a displacement of complexity from the people to the machine, allowing modern, complex machines to be handled by people less able than those who handled the former machines with their more complex interfaces.

 

Of course, there is an economic issue here: a big investment in technological design can be recovered if the number of machines sharing the design is high enough. The investment in design is made only once, but it can drive important savings in the training of thousands of pilots. At this point, the automation paradox appears: modern designs produce more complex machines with a good part of their tasks automated. Automation makes these machines easier to handle under normal conditions than the previous ones. Hence, less trained people can operate machines that, internally, are very complex. Once complexity is hidden at the interface level, less trained people can drive more complex machines, and that is where the automation payback is.
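
The shape of that trade-off can be shown with a minimal sketch; all numbers are invented, purely for illustration: once the one-off design investment is spread over a large enough fleet, hiding complexity at the interface becomes cheaper than training people in depth.

```python
# All numbers invented, purely to show the shape of the trade-off.
design_investment = 500_000_000   # one-off automation design cost
saving_per_pilot = 50_000         # training saved per pilot
pilots_per_plane = 10             # crews rotating on each airframe

break_even = design_investment / (saving_per_pilot * pilots_per_plane)
print(f"Break-even fleet size: {break_even:.0f} planes")
# Beyond that fleet size, every extra plane pays back the design:
# that is where the automation payback is.
```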

 

The scary question is this one: what happens in situations that are unforeseen and, hence, not included in the technological design? In high-risk activities, the manufacturer usually has two answers to this question: redundancy and manual handling. However, both possibilities require a previous condition: the problem has to be identified as such in a clear and visible way. If it is not or if, even after being identified, the problem appears in a situation with no available time, the people trained to operate the machine can find that the machine "goes crazy" without any clue about the causes of the anomalous behavior.

 

Furthermore, if the operator receives full training, that is, training related not only to the interface but to the principles of the internal design, automation can no longer be justified, due to the increased training costs. We already know the alternative: the capacity to answer an unforeseen event is seriously jeopardized. 9/11 is one of the most dramatic tests of how people with little training can perform tasks that, before, would have required much more. However, this is not an uncommon situation, and it is nearer to our daily life than we might suspect.

 

Every time we have a problem with the phone, an incident with the bank or an administrative problem with the gas or electricity bill… we can start a process by calling customer service. How many times, after bouncing from one department to another, does someone tell us to dial the very number we had dialed at the beginning? Hidden under these experiences, there is a technological development model based on complex machines and simple people. Is this a sustainable model? Technological development produces machines harder and harder for their operators to understand. In this way, we do better and better the things we already knew how to do, while the things that were already hard become harder and harder.

 

9/11 was possible, among other things, as a consequence of a technological evolution model. This model is showing itself to be exhausted and in need of a course change. Rasmussen stated the requirement for this course change as a single condition: the operator has to be able to run cognitively the program that the machine is performing. This condition is not met and, if it were made mandatory, it could erase the economic viability, driving to a double challenge: one technological -making technology understandable to users beyond the operating level under known conditions- and the other organizational -avoiding the loss of the economic advantages.

 

Summarizing: performing better in the things we already performed well and, to do that, performing worse in the things we were already performing poorly is not a valid option. People require an answer always, not only when automation and I.T. allow it. Cost is the main driver of the situation. Organizations do not answer unforeseen external events and, even worse, complexity itself can produce events from the inside that do not have an answer either.

 

A technological model aimed at making the "what" easier while hiding the "why" is limited by its own complexity, and it is constraining in terms of human development. For a strictly economic vision, that is good news: we can work with fewer, less qualified and cheaper people. For a vision more centered on human and organizational development, the results are not so clear. On one side, complexity puts up a barrier preventing the technological solution of the problems produced by technology. On the other side, that complexity and the opacity of I.T. make the operators slaves, without the opportunity to be freed through learning.

 

Three myths in technology design and HCI: Back to basics

It was a coincidence driven by the anniversary of the Spanair accident but, for a few days, comments about the train accident in Santiago de Compostela and about the Spanair accident appeared together. Both share a common feature beyond, of course, a high and deadly cost. This feature could be stated like this: «a lapse cannot be allowed to drive to a major accident. If it happens, something is wrong in the system as a whole».

The operator -pilot, train driver or whoever- can be held responsible if there is negligence or a clear violation, but a lapse should be prevented by the environment and, where that is not possible, its consequences should be decreased or nullified by the system. Clearly, that did not happen in either of these cases but… what was the generic problem? There are some myths related to technology development that should be explicitly addressed, and they are not:

  • First myth: There is no intrinsic difference between open and closed systems; if a system is labeled as open, that comes only from ignorance, and technology development can convert it into a closed one. To be short and clear, a closed system is one where everything can be foreseen and, hence, it is possible to work with explicit instructions or procedures, while an open one has different sources of interaction, from outside or inside, that make it impossible to foresee all possible disturbances. If we accept the myth as truth, no knowledge beyond the operative level is required from the operator once technology has reached the point where the system can be considered closed. The normative approach should be enough, since every disturbance can be foreseen.

Kim Vicente, in his Cognitive Work Analysis, used a good metaphor to attack this idea: is it better to have specific instructions to arrive at a place, or is it better to have a map? Specific instructions can be optimized, but they fail under closed streets, traffic jams and many other situations. A map is not as optimized, but it provides resources under unforeseen situations. What if the map is so complex that including it in the training program would be very expensive? What if the operator was used to a road map and now has to learn how to read an aeronautical or topographic map? If the myth holds, there is no problem: closed streets and traffic jams do not exist and, if they do, they always happen in specific places that can be foreseen.
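
The metaphor can be made concrete with a minimal sketch in Python, using an invented toy street grid: the "instructions" strategy replays a fixed, optimized route and fails as soon as one street is closed, while the "map" strategy can re-plan because it keeps the whole graph.

```python
from collections import deque

# Toy street map (invented): node -> reachable neighbor nodes.
street_map = {
    "home": ["A", "B"], "A": ["C"], "B": ["C", "D"],
    "C": ["work"], "D": ["work"],
}
fixed_route = ["home", "A", "C", "work"]   # optimized instructions

def follow_instructions(route, closed):
    # The closed-system view: any unforeseen closure means failure.
    for a, b in zip(route, route[1:]):
        if (a, b) == closed:
            return None
    return route

def replan_with_map(graph, start, goal, closed):
    # The open-system view: breadth-first search around the closure.
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        for nxt in graph.get(path[-1], []):
            if (path[-1], nxt) == closed or nxt in seen:
                continue
            if nxt == goal:
                return path + [nxt]
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

closed_street = ("A", "C")  # the unforeseen disturbance
print(follow_instructions(fixed_route, closed_street))  # None: stuck
print(replan_with_map(street_map, "home", "work", closed_street))
# ['home', 'B', 'C', 'work']: the map still gets us there
```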

  • Second myth: A system where the operator has a passive role can be designed in a way that preserves situation awareness. Perhaps, to address this myth properly, we should go back to a classic experiment in psychology: http://bit.ly/175gKIc where a cat transports another one in a cart. Supposedly, the visual learning of both cats should be the same, since they share the same information. However, the results say it does not happen: the transporting cat gets much better visual learning than the transported one. We don't really need the cats or the experiment to know that. Many of us have been driven to a place many times by another person. What happens when we are asked to go alone to that place? Probably, we never learned the way. If this happens with cats and with many of us… is it reasonable to believe that the operator will be able to solve an unplanned situation in which he has been fully out of the loop? Some designs could be removing continuous-feedback features because they are hard and expensive to keep and, supposedly, add nothing to the system. Some time ago, a pilot of a highly automated plane told me: «Before, I drove the plane; now the plane drives me»… that is another way to describe the present situation.
  • Third myth: Availability bias, or "we are going to do our best with the resources we have". This can be a common approach among designers: what can we offer with the things we already have or can develop at a reasonable cost? Perhaps that is not the right question. Many things we do in our daily life could be packed into an algorithm and, hence, automated. Are we stealing pieces of situation awareness by doing so? Are we converting the «map» into «instructions», with no resources left when those instructions cannot be applied? Nevertheless, for the last decades, designers have been behaving like that: providing an output in the shape of a light, a screen or a sound is quite easy, while handles, hydraulic lines working -and transmitting- pressure and many other mechanical devices are harder and more expensive to include.

Perhaps we should remember «our cat» again, and how visual and auditory cues might not be enough. The right question is never about what technology is able to provide, but about the situation awareness the operator has at any moment and about his capabilities and resources to solve an unplanned problem. Once we answer this question, some surprises could appear. For instance, we could learn that not everything that can be done has to be done and, by the same token, that some things that should be done have no cheap and reliable technology available. Starting a design by trying to provide everything technology can provide is a mistake and, sometimes, this mistake is subtle enough to pass undetected for years.

Many recent accidents point to these design flaws, not only the Spanair and Renfe ones: automated pilots that get data from faulty sensors (Turkish Airlines, AF447 or Birgenair), stick-shakers that can be programmed -instead of behaving as the natural reaction of a plane near stall- provoking an over-reaction from fatigued pilots (Colgan), indicators where a single value can mean opposite things (Three Mile Island) and many others.

It is clear that we live in a technological civilization. That means assuming some risks, even catastrophic ones like an EMP or a massive solar storm. However, there are other, minor and everyday risks that should be controlled. Having people to solve the problems while, at the same time, stealing from them the resources they would need to do it is unrealistic. If, driven by cost-consciousness, we assume that unforeseen situations are below one in a thousand million and, hence, an acceptable risk, let's be coherent: eliminate the human operator. If, on the other side, we think that unforeseen situations can appear and have to be managed, we have to provide people with the right means to do so. Both are valid and legitimate ways to behave. Removing resources -including the ones that allow situation awareness- and, once the unforeseen situation appears, using the operator as a fuse to burn while speaking of «lack of training», «inadequate procedure compliance» and other common labels is neither a right nor a legitimate way. Of course, accidents will happen even if everything is properly done but, at least, the accidents waiting to happen should be removed.

Santiago accident: The train driver's statements

After the Santiago de Compostela accident, I must be among the few Spaniards who are not train specialists. However, I am familiar enough with human factors and safety to find surprising things in the published statements of the train driver involved in the accident:

  • When the judge asks him whether one can cover four kilometers while distracted, the driver's answer is pure common sense: "At 200 km/h, four kilometers pass very quickly." True. At 240 km/h, four kilometers pass in exactly one minute.

After this answer, someone might think that, driving completely distracted for a whole minute, he would surely end up off the road, and he would be right: that happens on the road; it does not happen in trains, just as it does not happen in ships or planes, and that should give us a first clue:

The road demands from the driver more attention to the surroundings and a certain level of activity, due to the need to follow the layout, to anticipate traffic situations, etc. In other means of transport what is demanded, except at specific moments, is to stay alert, supervising the operation of the vehicle under control. In aviation, highly automated systems have often been criticized for keeping the pilot out of the "control loop" or, in other terms, the plane flies on its own but, when something happens, urgent intervention is required from someone whose function was to be there "just in case".

If there is something all of us who work, in one way or another, in human factors agree on, it is the fact that we are terrible supervisors. We are not "designed" to supervise but to do. When someone sets us to supervise, distractions and attention failures inevitably appear; this is a well-known fact in aviation, but its application is not exclusive to that field.

  • When the judge asks him whether there is an automatic braking system, the driver answers that in that zone it is he who brakes, not an automatic system. When the question is repeated, insisting on what would happen if the driver did not brake at the point where he should, the surprise arrives: above 200 km/h, the train would trigger all the braking devices automatically until it stopped; below 200 km/h, nothing happens. The curve of the accident was limited to 80 km/h. What is the use of an automatic protection that only works above 200 km/h? The answer is in the newspapers of the past few days and in the video on YouTube.
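
A minimal sketch of the protection logic as described in these statements (thresholds taken from the text; the real ATP logic is surely more elaborate) makes the gap obvious: a train entering an 80 km/h curve at 190 km/h triggers nothing.

```python
def automatic_brake_engages(speed_kmh: float) -> bool:
    # As described by the driver: full automatic braking only above
    # 200 km/h, regardless of the local limit of the track section.
    return speed_kmh > 200

CURVE_LIMIT_KMH = 80  # limit of the accident curve, from the text

for speed in (210, 190, 100):
    print(f"{speed} km/h: over curve limit = {speed > CURVE_LIMIT_KMH}, "
          f"automatic braking = {automatic_brake_engages(speed)}")

# 190 km/h: more than twice the curve limit, yet no automatic braking.
```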

There is no valid middle ground here. If the driver is assumed to be a mere supervisor, equip the line and the train with automatic systems that prevent situations like the one that produced the Santiago accident, and accept that the driver is going to get distracted, as we all do in surveillance tasks because, once again, we are not made to watch but to do.

If the driver is assumed to have a more active role, design trains that demand that more active role. Pilot-proof planes and driver-proof trains also bring their own problems, sometimes in the shape of a lack of realism about which things we do well and which ones we do badly.

Train drivers and pilots: Human factors in trains and planes

The terrible train accident in Santiago de Compostela has put on the table, once again, the issue of human error which, in this case, was dressed from the very beginning in the concept of recklessness.

Sincerely, I always distrust the diagnosis of "human error" as the cause of an accident. In aviation, in railways and in any other field, there are many powerful interests inviting the use of the train driver or the pilot as a kind of fuse, so that the system, and those who run it, remain beyond all scrutiny.

When someone decrees that 90% of accidents are produced by human failure, one may wonder whether the intention is to prevent anyone from looking higher up than the train driver, the pilot or whoever had the bad luck of being closest to the accident.

That said, and having explained my motives for distrust, it is true that reality is usually very stubborn and that recklessness exists: people are not machines, and we can decide at a given moment to behave recklessly. Of course, even if the recklessness has professional or criminal consequences, or is even paid with one's own life and the lives of others, the analysis cannot stop there: Is it the first time, or are there signs that such behavior had been repeated over time? If there were signs, why were they not detected? Is it possible or desirable to use technological protections? Were the place or the situation intrinsically dangerous, making this an accident waiting for the opportunity to happen?

The situation confronting us today with the Galicia train, at least with what we know so far, already occurred five years ago in an aviation accident: Santa Bárbara Airlines flight 518. The plane took off from Mérida, in a mountainous zone of Venezuela, and crashed minutes after take-off.

The subsequent investigation showed that the pilots took the plane into the air the way most of us drive a car away: turning the ignition key and leaving. The pilots knew the zone very well; they knew it so well that they took off without giving the gyroscopes on which the navigation system depends time to align, and they flew into mountainous terrain, without visibility and with no reference other than the plane's compass. The cockpit voice recording suggested that it was not the first time the pilots had acted this way; simply, this time it went wrong. Nobody knew it was happening or, if someone knew, once the accident happened he chose to keep silent so as not to be accused of allowing such a practice.

Clear recklessness? No doubt; however, even in this case, we have to analyze whether it was detectable, whether it was avoidable and whether, once committed, its effects could be contained. When we speak about trains, there is always an advantage over planes: it is hard to think of negative safety effects from a technological protection that stops the train or reduces its speed in case of driver error or violation. This is not true in aviation, where technological protections assume that a given situation is occurring and, if that is not the actual situation or a sensor fails, the protection can become the direct cause of the accident.

Is the track layout inadequate? Possibly but, with the Santiago de Compostela curve so close, if the train was going through it at 190 km/h, where was he planning to brake?

Was the driver rested or, after several hours with the rails and sleepers etched on the retina, did he have a propensity to error that one does not have at the beginning of a trip?

There are still many questions to answer but, along with the already stated distrust of any diagnosis that tries to close the case with the "human error" stamp, there must also be a recognition that plain recklessness exists. Is that the case of the Santiago de Compostela train? Today we cannot know yet. Let us hope that, when we reach the end of the investigation, interests will not have prevailed over the truth. The victims always deserve that the truth be reached, whoever may fall, whether it has to be a train driver or a minister.

Human Factors: Pilot Error

A few days ago, I was watching a documentary about air accidents. The bad part of such documentaries is that, if you know an accident in depth, you will always find a trivial, if not contradictory, story. The good part is that you can find some interesting case you did not previously know about. That was my situation with Crossair 3597 and the analysis of the facts. For good reasons, I did not trust the TV documentary and went to the official report. However, in this case the TV seemed to be right: the investigators went through the whole record of the PIC, found it hard to explain why this pilot was flying passengers, and wrote all of these findings in the official report.

In some places, it is hard or impossible to run this kind of investigation. Certainly, regulators and operators can have the biggest interest in making the pilot appear guilty of an accident: first, if he did not survive, he will not reply; second, responsibility for the accident moves a little further away from them; and third, they do not have much to change under this kind of outcome since, in the end, the problem came from a bad pilot.

Pilots' associations, for their part, do not like having the record of a pilot subjected to any kind of criticism, and they try to stop any investigation that could go beyond current licenses and type ratings. Anything else is called «throwing garbage on his memory». Actually, I remember a situation when, after pointing to the low experience of a pilot involved in an accident, someone told me that I could not demonstrate that a more seasoned pilot would have avoided the accident. That was true, as was the fact that the person who said it could not demonstrate that an expert pilot would not have avoided the accident: a draw.

Pilot error, of course, exists. If the record of a pilot is clearly inadequate and ends in a deadly accident, knowledge of this record will trigger questions about who hired him, how and why, and about the training and the checks that every pilot has to pass. The victims, very often including the pilot himself, deserve that the records be examined without a witch-hunting spirit, trying to understand and to avoid the same mistake in the future. Perhaps this is more important than ever: we have MPLs and even pilots completing their training in the right seat of some airlines.

The Crossair example is a good one. Any investigation should be driven by the principle of knowing the truth, the whole truth and nothing but the truth… and no interest should be above that.

The always mixed blessing of automation in Aviation

I found this interesting article about automation in AVweb: http://www.avweb.com/news/features/Automation-Friend-Or-Foe220153-1.html Probably nobody -including myself- is going to dispute that automation came to stay in aviation, and there is nothing wrong with that.

The problem that can make articles like this one uncomfortable for some of us is a kind of default position where a deeper analysis is missing. Can we say that in the past automation supported humans and now it is the reverse? Probably that is right since, at this moment, many features of aviation would be lost if someone insisted on people merely supported by technology. A CAT III landing, for instance, simply cannot be done by a human pilot. The precision required to navigate our crowded skies, with decreased separation between planes, would again be impossible for a human hand to keep over a long period of time. So, what is the problem?

First, if we accept that humans are going to perform as support for automation, they should have the resources -understood in a wide sense- to perform that function. People, to work properly in aviation and probably in any other field, cannot be subject only to a feedback loop. They have to be able to anticipate what is coming, that is, they need enough information to forecast the next minutes or the next hours. If they do not have this kind of situation awareness, they cannot be expected to solve an important problem once it appears. Is all the automation in the market designed to support a pilot whose head is ahead of the plane, or is the pilot expected to remain quiet, running checklists, and to intervene only after a problem unmanageable by the automation has appeared?

That is the problem of the «default position»: when someone speaks about automation, it is very common to forget that automation comes in two big classes: good and bad. We cannot assume that automation is always good and that all the effort has to come from the human operator, who has to adapt to it. Adaptation has to be mutual, and that also means getting rid of features that can induce errors or hide key information. This is not new: when highly automated planes started to crowd the market, pilots were supposed to adapt to the situation, even if that situation could be shouting about a bad design. Some problems are so old that they were already addressed by people like the late designer of the Macintosh, Jef Raskin, but they are still among us.

Automation? OK but, if we want to keep an alternative resource to automated systems, we need to keep current the knowledge and abilities required to fly a plane manually, and it seems there is a problem here, one made public by someone as relevant in aviation as the UK CAA. Furthermore, automation has to be designed thinking of someone who could be required to take the controls in delicate situations. If a design does not meet these criteria and, even so, once in the market it is considered unquestionable, let's be coherent: do not have pilots. If they cannot be that alternative resource, it would be more coherent to fully trust the automation and, if a crash happens, tell the relatives of the victims that the probability was extremely low and it will probably never happen again. Acceptable? Obviously not.

Nothing to say against the presence and relevance of automation and I.T. in aviation. The only unacceptable part is the default position that makes automation an unquestionable piece of the environment, without stopping to think that, perhaps, automation and information technology can also be wrongly designed and that, when that happens, it is not fair to ask everyone to «adapt» to the creative genius of the designer who, in some cases, might be neither creative nor a genius and, especially, not a designer. Designs, like everything else, have to be open to question and, when something shows itself to be wrong, the «Human Error» or «Lack of Training» labels cannot be used to hide the fact.