The "Facts", "Company" and "Investigation" sections of this essay are based on "Downfall: The Case Against Boeing", a Netflix documentary directed by Rory Kennedy and released on February 18, 2022. My only objective in this essay is to learn from this case in order to help companies improve their risk management.
The Facts
On Oct. 29, 2018, Lion Air Flight 610 took off from Jakarta, Indonesia, and crashed into the ocean a few minutes later, killing all 189 people on board. The plane was a brand-new Boeing 737 Max, the weather was good, and at the time nobody understood what had caused the plane to nosedive into the ocean. In the following hours, pilot Bhavye Suneja, the Indonesian authorities, and the airline were all suspected of wrongdoing, but nothing had been concluded.
When the black box was recovered, investigators noticed that the flight pattern had been erratic and that, after takeoff, there had been a failure of the left-hand angle of attack indicator, a sensor located on the side of the cockpit that measures the angle of the aircraft's nose relative to the oncoming air.
Andy Pasztor, a Wall Street Journal journalist, started investigating the case. Dennis Muilenburg, Boeing's Chairman and CEO, declined to comment on the investigation, but Boeing said that an American pilot would never have gotten into this kind of situation, and blamed the Indonesian crew for not doing everything they were supposed to do during the emergency.
On Nov. 11, twelve days after the accident, Boeing said it had been an MCAS failure. Nobody knew what that meant; after some research, the term was found only in the abbreviations section of the plane's manual. MCAS stands for "Maneuvering Characteristics Augmentation System", a piece of software designed to push the plane's nose down when the angle of attack sensor detects that the plane is climbing at an angle steep enough to cause a stall. Unfortunately, pilots did not know MCAS existed on the plane, so a failure in the angle of attack sensor activated the MCAS and sent the aircraft's nose down… into the ocean.
A senior executive at Boeing said that they had never informed or trained the pilots on MCAS because they did not want to overwhelm them with too much information.
On Nov. 27, 2018, Boeing executives (with their lobbyists) visited the Allied Pilots Association, the pilots' union, to discuss the matter. They said they would fix the software and that it would take about six weeks. The pilots asked to ground the planes until then, but Boeing refused, saying that there was no conclusive evidence that the accident had been caused by this problem.
Boeing's chairman went on national TV and, when questioned about the actions taken after the accident, said that the company had pointed the pilots to existing procedures for handling this kind of problem, and repeated that the 737 Max was a safe plane. Additionally, Boeing announced an increase in its stock buyback program to $20 billion and a 20% increase in dividends, reaffirming its bullish outlook to the capital markets.
Nineteen weeks after the first crash, on Mar. 10, 2019, Ethiopian Airlines Flight 302, headed for Nairobi, Kenya, crashed shortly after taking off from Addis Ababa, killing all 157 people on board. Again, a brand-new Boeing 737 Max, flying in perfectly fair weather, nosedived; this time into a semidesert area.
Boeing immediately issued a statement saying that safety was their number one priority and that they had full confidence in the safety of the Max. They also stated that the FAA had not mandated any further action at the time. The FAA said that it had no plans to ground the 737 Max, at least until it had further data. Elaine Chao, the US Secretary of Transportation, flew on a Boeing 737 Max and said that if evidence was found linking a plane failure to the accidents, the plane would be grounded. The Chinese government, however, unilaterally decided to ground the Boeing 737 Max, and several other countries followed almost immediately.
Once the black box of the Ethiopian Airlines flight was recovered and analyzed, it showed a flight pattern similar to that of the Lion Air flight. A few days later, at the accident site, the "jackscrew", a vital piece of the aircraft that drives the stabilizer trim, was found. It was set in a "full nose down" position, proving that the MCAS was to blame for the accident… again. Within 30 minutes, President Trump was on national TV, grounding all Boeing 737 Max aircraft. After the announcement, all the grounded planes needed to be flown to their storage locations, but pilots no longer wanted to fly them…
At this point, Congress became involved, and Rep. Peter DeFazio (D-Oregon) opened an investigation into the matter. Additionally, the families of the victims started to put pressure on Congress in Washington, DC.
After the second crash, Boeing blamed the Ethiopian crew, saying they had failed to do everything they were supposed to do. Boeing campaigned very aggressively for its position that human error was behind both accidents. Capt. Chesley (Sully) Sullenberger responded: "We should not be blaming the pilots; we should not expect the pilots to compensate for flawed designs".
Once the second black box was recovered and analyzed, it showed that the Ethiopian crew had acted exactly as Boeing had instructed for this situation. Unfortunately, this did not work either, because at such a speed the plane simply could not be trimmed up manually.
This had a severe impact on Boeing’s reputation.
The Company
For several decades, Boeing had a strong reputation for excellence and safety. Every single employee of the company understood clearly that the safety and security of every plane they manufactured was their responsibility, and they felt respected and valued, both by the company and by society.
In 1997, when McDonnell Douglas and Boeing merged, this started to fall apart. Harry Stonecipher, McDonnell Douglas's CEO, became President and COO of the merged company (and later its CEO), and the main focus shifted to financial value and creating returns for shareholders. Everybody had to focus on increasing the share price on Wall Street. Boeing's original culture of not taking shortcuts, of doing it right or not doing it at all, gave way to cheaper planes, fewer controls, lower quality, and faster time to market. The new management started removing quality controls to speed up production, and former employees stated that anyone who reported a problem could easily be fired: the new culture was "kill the messenger, ignore the problem". The company announced large layoffs and focused on doing more with fewer people. Around this time, it also moved the company headquarters to Chicago, IL.
A few years later, Boeing started to face harsh competition from Europe's Airbus. In 2003, after several great years for Airbus and bad ones for Boeing, Airbus became the market leader for the first time. This increased the managerial pressure: it was all about designing products and getting them to market as fast as possible to beat the competition. Quality and safety issues became less important.
In 2010, with oil prices at record highs, Airbus introduced the new A320neo, the most fuel-efficient plane at the time. It became the fastest-selling plane in the history of aviation and sent Boeing's top management into a panic; they did not have a plane to compete with the Neo and had no time to design a new one. So, in 2011 the company controversially announced the new Boeing 737 Max, claiming it would be the most efficient plane in the single-aisle segment. Essentially, they put more fuel-efficient engines on the old 737 (an airframe first launched in 1967).
To get the plane to market as fast as possible, it was crucial that it be positioned as a new version of the same plane, since this would significantly shorten the FAA approval process and the airlines would not need to incur the high cost of additional pilot training. Boeing guaranteed airlines that the new plane would not require new simulator training for pilots to fly it.
The Boeing 737 Max was launched and was a huge success; lots of orders came in and Boeing’s stock price skyrocketed.
The Investigation
Rep. DeFazio said he had been waiting to receive information from the company in order to work with official evidence. Boeing was reluctant to send information, but after long negotiations, and with the help of the House lawyers, he started receiving documents. Among them was a report showing that in 2013, in the early stages of the project, a group of employees had discussed the MCAS and its safety. The problem was that the new, modern, efficient engines would not easily fit on a 40-year-old plane. The fear was that the weight and location of the new engines would shift the plane's balance backwards, so the nose would rise and push the aircraft toward a stall. The MCAS was designed to level the plane in such an event.
Additionally, it was reported that the MCAS had to be modified twice during the production process. In the beginning it only worked at high speed, but then they realized it also needed to work at lower speeds. This modification required a larger range of movement of the stabilizer trim, which allowed more radical nose-down commands. The second modification was that instead of relying on information from two sensors, the MCAS relied on information from just one. The problem with this decision is that if the sensor is damaged during flight, it sends faulty information, and the MCAS takes over the plane and pushes its nose down.
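To illustrate the design issue in simplified terms, the sketch below shows a hypothetical sensor cross-check in Python. It is not Boeing's actual flight software, and the threshold and stall angle are invented for illustration; the point is only that a single-sensor design has no way to detect that the sensor itself has failed, whereas a two-sensor cross-check can at least disable the automated response when the readings disagree.

```python
DISAGREEMENT_LIMIT_DEG = 5.5  # hypothetical threshold for declaring the two readings inconsistent

def cross_checked_command(aoa_left: float, aoa_right: float, stall_aoa: float = 15.0) -> str:
    """Simplified illustration: only trust an automated nose-down command
    when two independent angle-of-attack readings agree."""
    if abs(aoa_left - aoa_right) > DISAGREEMENT_LIMIT_DEG:
        # Sensors disagree: a likely sensor fault, so do nothing automatic and alert the crew.
        return "disable automation, alert crew"
    if (aoa_left + aoa_right) / 2 > stall_aoa:
        return "command nose down"
    return "no action"

# With one faulty sensor (left reads 25 degrees while right reads 4), a single-sensor
# design would push the nose down; the cross-check refuses to act instead.
print(cross_checked_command(25.0, 4.0))   # "disable automation, alert crew"
print(cross_checked_command(16.0, 15.5))  # "command nose down"
```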
Nobody at Boeing told the airlines or the regulator about this new system, in order to avoid a lengthy approval process that would delay time to market and additional pilot training that would increase the airlines' cost of adopting the new plane. As we have seen, pilots needed to be trained to react to an MCAS-related problem, but the company decided not to disclose the system's existence. In fact, Boeing kept stressing that this plane would not need simulator training for pilots, and that if the regulator asked for it, Boeing would not allow it to happen. Paradoxically, in 2017, Lion Air asked Boeing whether its pilots should receive extra training for the new plane; Boeing said there was no need for it, and went as far as mocking Lion Air for asking. Internal reports disclosed that Boeing knew a potential failure of the MCAS would have catastrophic effects, yet still decided to hide it and offer no training to the pilots.
After the Lion Air crash, the FAA performed a Transport Aircraft Risk Assessment Methodology (TARAM) analysis, concluding that the Max could crash fifteen more times during its service life, at a rate of roughly one crash every two years. Unfortunately, the study was not disclosed, but Boeing received the results and promised to fix the problem before another crash occurred.
Boeing, after learning about the TARAM study, and essentially knowing that another Max could crash at any time, did nothing to eliminate the risk of a second crash. They simply bet on a second accident not happening before they could find a solution.
In the meantime, public opinion was debating whether Boeing had put an unsafe plane in the air, and whether 346 lives were the cost of the company doing "business as usual" with total disregard for risk.
My Comments
The Boeing 737 Max case was a tragedy. It happened, and we cannot go back in time and change the events. What we can do is study them and, hopefully, learn something about risk management and the reasons why companies fail to do it right. Because, let's face it, this was a major failure in risk management, and it happened in a large corporation in an industry in which risk should be a serious matter. It is worth noting that Boeing is regulated by the FAA, so there is also an issue with regulation and its ability to enforce an adequate risk management program, both at the design and at the execution level. There is no easy way to enforce a good risk management program. Regulators require companies to show they have a program, but it is very difficult to know whether they are actually managing risks or merely have a risk management program. The only way of doing it properly is through a cultural change across the whole company, and this must start with top management.
A real change in the perceived importance of a company-wide risk management program would probably be the best way to honor the 346 people who paid with their lives for the lack of risk management and planning in companies throughout the world. I will discuss some insights in several short sub-topics.
Company Culture
The old Boeing was a company obsessed with quality and safety, a company with multiple double checks and controls to make sure it was putting the absolute best into its products; in short, a company in which the safety and security of its planes was embedded in its DNA. This strategy was probably costly, and during "normal times" the company probably left some money on the table in the form of higher costs, especially in the eyes of companies mainly focused on the financial bottom line. McDonnell Douglas seemed to be a more aggressive, results-oriented and financially driven company. In the clash of cultures, the latter prevailed, most likely because top management came from it, and the new merged company focused mainly on one stakeholder: the shareholders. The 737 Max accidents happened in 2018 and 2019, by which time the prevailing idea of which stakeholders a company should satisfy had, over the previous three decades, long shifted from shareholders alone to a broader set of stakeholders. The firm's culture at the time the Max was designed was antiquated and unacceptable, especially in an industry that needs a strong focus on safety and security. Capital markets are important, but they are not our only stakeholder, and top management cannot fail to understand this. This is a very important lesson to be learned from the Boeing accidents: company culture matters, and it matters a lot.
Identifying a Risk Is Not Enough; It Needs to Be Managed
Boeing's management knew there was a risk with the MCAS; the investigation reports showed that its impact could be catastrophic. So they must have either severely underestimated the probability of occurrence or the probability that a potential accident would be traced back to a design flaw of the plane; or the risk may simply have been bluntly ignored. In any case, nothing was done about this risk: no risk management, no mitigation plan, nothing.
This is unacceptable!
In a world in which a surgeon faces charges if a flaw in a medical procedure harms a patient, how can top management bear no responsibility in a case in which 346 people died? In fact, a couple of months after the Senate investigation, Dennis Muilenburg, Boeing's Chairman and CEO, was asked to resign and received stock and pension awards worth $62 million. In January 2021, the US Department of Justice charged Boeing with criminal conspiracy to defraud the FAA. The company agreed to pay $2.5 billion in fines and compensation to avoid criminal prosecution.
There are several issues we need to consider. Who was involved in the decision to go ahead with a project that carried an unacceptable risk? Was the CEO, or someone in the C-suite, aware of it? Or had the decision been delegated to a lower level of the organization, so that nobody at the C-level was aware of the risks? Either way, it would be a serious flaw in risk management. In any case, there is a clear connection between this problem and the tone at the top set by the C-suite, which discouraged employees from raising problems. We need to design risk management mechanisms that ensure top management is fully aware whenever a decision involves safety and human lives, and that encourage employees to raise any issue they consider important.
So, who is to blame for these accidents? Obviously, the chairman and CEO bears the ultimate responsibility, but there is certainly more to learn and discuss about how the process evolved.
These accidents also raise a common misconception in risk management. People tend to feel the need to quantify risk probabilities and impacts. That is, unfortunately, a very dangerous idea. How would you quantify the cost of an accident in a plane full of people? Some might propose using the cost of the compensation to the victims, plus fines and penalties, multiplied by the probability of occurrence. This is completely flawed. In a world in which we are in business to satisfy multiple stakeholders, we simply cannot quantify the cost of a human loss as compensation times probability. Once the accident happens, the probability is 1, and for the families of the victims the cost of the missing relative is far higher than the compensation calculated by the company. So, when lives are at stake, nobody should try to quantify the risk in terms of its effect on cash flow. The argument also holds for other, less dramatic risks: how do we quantify the impact on reputation? Is it even possible? Do we need to quantify it at all, or can we simply say, for example, that any risk rated high or very high is not acceptable? My suggestion: do not try to quantify unquantifiable risks. In the search for a number, we will necessarily rely on assumptions that will most probably drive us toward the wrong analysis. Just rate risks as "very low", "low", "medium", "high" and "very high", and be clear about whether the impact affects cash flow, the ability to operate, reputation, safety, or other aspects of the organization.
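To make this concrete, here is a minimal sketch of such a qualitative risk register in Python. The level names, impact areas, and acceptance rule are illustrative assumptions, not a standard and not Boeing's actual process; the point is only that an acceptance decision can be made without attaching a monetary number to human life.

```python
from dataclasses import dataclass

# Qualitative scales; the level names and impact areas are illustrative, not a standard.
LEVELS = ["very low", "low", "medium", "high", "very high"]
IMPACT_AREAS = ["cash flow", "ability to operate", "reputation", "safety"]

@dataclass
class Risk:
    name: str
    likelihood: str       # one of LEVELS
    impact: str           # one of LEVELS
    impact_area: str      # one of IMPACT_AREAS
    threatens_life: bool  # does the worst case involve loss of human life?

def is_acceptable(risk: Risk) -> bool:
    """Acceptance rule: never accept risks to human life, and never accept
    'high' or 'very high' impacts, regardless of how unlikely they seem."""
    if risk.threatens_life:
        return False
    if risk.impact in ("high", "very high"):
        return False
    return True

# An MCAS-like design risk is rejected even if its likelihood is judged "very low".
mcas_like = Risk(
    name="flight-control system relying on a single sensor",
    likelihood="very low",
    impact="very high",
    impact_area="safety",
    threatens_life=True,
)
print(is_acceptable(mcas_like))  # False
```

The design choice here is deliberate: the rule never multiplies probability by impact, so a catastrophic outcome can never be "averaged away" by a low likelihood estimate.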
Crisis Management
The case shows that Boeing had absolutely no crisis management in place. The way they handled both accidents was awful. Obviously, good crisis management would not have brought back the 346 lives, but it could have mitigated the impact of the crisis on the company's reputation. It is quite remarkable that after the first accident they were unable to improve their crisis management before the second event. If this is true for a large US corporation like Boeing, imagine the crisis management capabilities of smaller firms around the world. Companies need to improve their ability to manage crises, and this has to be done as part of their risk management program, more specifically, as a mitigation strategy for certain risks in their risk map.
The Regulator
Another interesting issue relates to the role of the regulator. In the beginning, the FAA did not investigate the new plane thoroughly and underestimated the MCAS. After the first accident, they suspected there was a problem with the plane; they studied it with the TARAM, and then they were certain there was a problem. However, they still decided to allow the Max to fly, indirectly allowing a second accident and the loss of 157 additional lives. In my view, the regulator was negligent in this case. We should take the opportunity this case offers to encourage regulators to improve their ability to help regulated companies design and execute better risk management programs.
Corporate Risk Management
In the documentary (Downfall: The Case Against Boeing) it is said that Boeing was doing business as usual. This is quite common in most companies. In some sense, we can say that top management has a bias toward avoiding risk management. Boeing's management knew this could happen, but they (we do not know who, or at which level) decided it was not important enough to act on. At that point in time, getting the product to market faster was more important than a potential safety issue; not having to retrain the pilots was more important than a potential safety issue. So they just moved on, ignoring the risks. This is extremely common: the low probability of occurrence of an accident gives people a strong incentive to neglect risks. Considering how many flights the 737 Max performed, two accidents looks like a small percentage, which, by the way, is something Boeing's CEO said on national TV. In most cases the event does not happen, and managers who neglected the risks easily get away with it: no event, nobody realizes the risk was even there, lower costs and higher profits. The problem is that sometimes events do happen, and when they do, we usually see large material and immaterial losses. Usually, after the event, the investigation shows that the tragedy could have been avoided if everybody had done their job, which mostly means putting the risks on the table and acting on them. This is called risk management.
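To see why "it seems like a small percentage" is misleading, consider a short back-of-the-envelope calculation. The numbers below are hypothetical, chosen only to show how per-flight probabilities compound at fleet scale; they are not the FAA's TARAM figures.

```python
# Hypothetical illustration: a risk that looks negligible per flight
# becomes very likely once a whole fleet flies for a couple of years.
p_per_flight = 1e-6      # assumed probability of a catastrophic event on any one flight
flights_per_day = 2_000  # assumed flights per day across the whole fleet
days = 2 * 365           # a two-year horizon

total_flights = flights_per_day * days
p_at_least_one = 1 - (1 - p_per_flight) ** total_flights
print(f"{total_flights:,} flights -> P(at least one event) = {p_at_least_one:.0%}")
# prints roughly 77%: "one in a million per flight" is not a small risk at fleet scale.
```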
When you are in the middle of a situation, it is very difficult to make adequate decisions all the time, especially when there are large potential gains at stake. Experienced mountaineers climbing Everest, for example, know that in their last push to the summit, if they have not reached it by 1 pm, they must turn back; otherwise, the probability of not making it back to base camp alive increases dramatically. This rule, which looks perfectly rational when discussed in a coffee shop or an office, is extremely difficult to accept when you have spent weeks on the expedition and several hundred thousand dollars of your savings to get there, especially if the summit seems very close. Therefore, the decision to turn back at 1 pm, no matter how close the summit, must be made and agreed on before you start climbing. Once you are up there, given the chance to keep pushing to the top instead of going down, you will most probably make the irrational decision. Technically, you are more likely to overestimate your capabilities (and your luck) and to underestimate the probability and impact of an unwanted event.
Taking this Everest example back to Boeing: in a rational environment, most of the people involved in the decision would probably have concluded that the MCAS risk had to be disclosed and addressed. Unfortunately, with all the pressure and the sense of urgency, nobody acted rationally, and both the probability of occurrence and the impact were underestimated.
This is why risk management policies are so important. A rational policy would have clearly stated that any risk that could bring the plane down and cost human lives must never be neglected. Unfortunately, if you do not have the policy, irrationality kicks in and wrong decisions are made, with potentially catastrophic results.
Let's hope this tragedy helps top management and regulators do more rational risk management in the future.