Table of Contents
3. Superordinate Issues
- The concept of an “isolated case”
- What are damage-causing failures?
- What does ETOPS mean?
- Redundancies and engine failure
“Isolated cases”, Murphy's Law, and “disimprovement”:
The first time a certain type of damage occurs in an engine or component, the term “isolated case” is frequently used. However, this term should only be applied if the probability of the damage occurring can be classified as “improbable failure conditions” as per JAA 25.1309 (see Figure "Effects of failures"). This kind of rare occurrence cannot be assumed if the damage was caused by technology or operating conditions that may be present in other engines or components of comparable engine types. Deeming many damages to be isolated cases is not unlike a ship captain who sees a dangerous reef after a long journey and concludes that there are no other dangerous reefs further along the way. Therefore, it is extremely important to determine whether the term “isolated case” is being applied simply due to a feeling of helplessness regarding the problem. This helplessness can be due to several factors, such as remedies that are not immediately recognizable or that seemingly require an overly large amount of effort relative to the seriousness of the problem.
Many damage mechanisms (e.g. crack corrosion) are caused by a combination of several factors. Experience has shown, however, that Murphy's law often applies to these situations, and the most dangerous combination of factors will occur despite expectations to the contrary. It seems that “apparent redundancies” are especially prone to this type of failure (Example "Special sensitivity for quenching"). Statically determinate designs (e.g. a stool with three legs) often have a functional elegance that is difficult for engineers to resist. Many engineers do not realize, however, that if a single component of such a design fails, the whole design will fail catastrophically, which is not the case with designs that are not statically determinate (e.g. a stool with five legs).
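The stool comparison can be put into numbers with a minimal sketch. It assumes that legs fail independently, that the three-legged stool needs all of its legs, and that the five-legged stool tolerates the loss of any one leg. The function name and the failure probability are hypothetical and serve only as illustration.

```python
from math import comb


def survival_probability(n_legs: int, min_legs_needed: int, p_leg_fails: float) -> float:
    """Probability that at least `min_legs_needed` of `n_legs` legs hold.

    Assumes independent leg failures (an idealization).
    """
    p_ok = 1.0 - p_leg_fails
    return sum(
        comb(n_legs, k) * p_ok**k * p_leg_fails ** (n_legs - k)
        for k in range(min_legs_needed, n_legs + 1)
    )


p = 0.01  # hypothetical failure probability of a single leg

# Statically determinate: every one of the three legs is needed.
three_leg = survival_probability(3, 3, p)

# Not statically determinate: assumed to tolerate the loss of any one leg.
five_leg = survival_probability(5, 4, p)

print(f"three-legged stool survives with probability {three_leg:.6f}")
print(f"five-legged stool survives with probability  {five_leg:.6f}")
```

The independence assumption is exactly what “apparent redundancies” violate: if a common cause weakens all legs at once, the calculated advantage of the redundant design is lost.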
Damage often occurs only after long incubation times (Fig. "Incubation time"), and is then the first damage of its type. Experience has also shown that it will not take long before several “isolated cases” have occurred.
Knowing the original damage-causing failure is vital in order to classify risks and estimate the probability of damage occurring. However, this can be very difficult if, for example, testing procedures or materials involved do not safely exclude this type of damage (Fig. "Cause of damage").
If improvements or remedies are undertaken that are not based on a sufficiently accurate damage analysis, the situation that is likely to result can be referred to as “disimprovement” (Fig. "Disimprovement"). The same parts will fail, along with others that were not assumed to be connected with the damage, and the damage symptoms may be quite different. The special sensitivity of turbine engines to “disimprovements” is shown by the large number of examples in which remedies for damage or weak points resulted in new damages and problems. Fig. "Sensitivity to change" is an attempt to explain this phenomenon.
An argument that must be treated with extra caution is that, because other versions of the same engine type do not have certain problems, certain factors can be ruled out as the cause of damage. This is especially true if the affected engine version is an increased-performance “family member” (Fig. "Overload").
The term “damage-causing factor” is used to describe a type of factor, the absence of which would have prevented the damage from occurring or caused it to occur differently. This type of factor need not be a failure or malfunction. For example, it may be a certain weather condition, which only becomes relevant when the pilot makes a mistake in evaluating the conditions and takes an improper action. Therefore, several damage-causing or contributing factors are likely to combine, and experience has shown this to be true in most cases of damage. It is vital to recognize all important factors and correctly evaluate their contribution to the damage that occurred. Realistic classification of these factors is the highest level of damage analysis and requires a high measure of objectivity, impartiality, self-criticism, and experience in a wide technical field. Frequently, conscious or unconscious constraints lead to mistaken evaluations of the damage factors (e.g. if the engineer brought in to investigate the damage contributed to the design of the failed components, or the designer does not question his own work). In this case, 5% effects become 90% effects, and important factors are downplayed or seen as being taboo. In order to prevent this type of situation from occurring, it should be ensured that none of the responsible damage investigators selected are emotionally connected with the damaged parts. Many cases have shown that if this rule is not observed, the engineers responsible for the damage investigation may do things such as classifying what are really minor secondary causes as the major causes of the damage. In turn, this promotes recurring parallel damages over a long period of time, even though a whole list of remedies and improvements was introduced. However, it is also necessary to conduct discussions with a wide range of technicians, especially the affected specialists.
If the damage analyst is confident in his duty to critically evaluate the information he is given, he can include the arguments of the affected engineers in his analysis.
Deciding whether or not a discovered failure caused damage (Fig. "Cause of damage") requires a great deal of experience. Many weak points are typical for the technology and unavoidable. Examples of this include micro-cavities in cast turbine disks and blades. While overloads can cause fatigue cracks to spread from these cavities, it would be a mistake to classify them as the cause of damage if they were within the given specifications, and the strength rating for the part was based on values from tests of samples with the same type of weak points.
Figure "Cause of damage": Searching for the “damage-causing failure” is of fundamental importance for preventing failures and designing proper remedies. In this case, the issue is not whether the failure was the main cause or only a contributing factor, but whether it is even an unallowable failure or a weak point characteristic of the material or technology. When designing parts, this type of unavoidable weak point must be properly taken into account with regard to conditions such as operating behavior and life span.
Failures that cannot be safely prevented by the prescribed testing procedures for the series and/or system monitoring must be classified as weak points. A clue that indicates a flaw of a size that can be safely discovered by penetration testing is “technical scribing” (bottom diagram). This is defined as a semi-elliptical surface crack with a length of 0.8 mm and depth of 0.4 mm. A criterion for parts at risk for crack growth may be the threshold (also see Fig. "Incubation time") of the stress concentration at the flaw. Flaws with an effective crack length of a_th, whose stress concentration at all relevant operating loads is under the threshold value K_th, will not grow further and do not affect the life span of the part. However, this is only true if other damaging influences (e.g. corrosion) do not cause the flaws to grow beyond the threshold value during operation, or the threshold value is not reduced by effects such as embrittlement.
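The threshold criterion can be illustrated with a minimal sketch. It assumes that K_th denotes a threshold stress intensity factor in the sense of linear-elastic fracture mechanics, and that the stress intensity at the flaw follows the textbook relation K = Y · σ · sqrt(π · a). The geometry factor Y, the stress levels, and the threshold value below are hypothetical placeholders, not data for any real material. Only the 0.4 mm flaw depth is taken from the text.

```python
import math


def stress_intensity(stress_mpa: float, crack_depth_m: float, geometry_factor: float) -> float:
    """Stress intensity K = Y * sigma * sqrt(pi * a), in MPa*sqrt(m)."""
    return geometry_factor * stress_mpa * math.sqrt(math.pi * crack_depth_m)


def threshold_crack_depth(stress_mpa: float, k_th: float, geometry_factor: float) -> float:
    """Crack depth a_th (in m) at which K just reaches the threshold K_th."""
    return (k_th / (geometry_factor * stress_mpa)) ** 2 / math.pi


def flaw_can_grow(stress_mpa: float, crack_depth_m: float, k_th: float, geometry_factor: float) -> bool:
    """True if the flaw is loaded at or above the threshold K_th."""
    return stress_intensity(stress_mpa, crack_depth_m, geometry_factor) >= k_th


SCRIBING_DEPTH_M = 0.4e-3  # depth of the "technical scribing" flaw from the text
Y = 0.7                    # hypothetical geometry factor for a surface crack
K_TH = 5.0                 # hypothetical threshold value in MPa*sqrt(m)

for stress in (100.0, 300.0):  # hypothetical operating stresses in MPa
    k = stress_intensity(stress, SCRIBING_DEPTH_M, Y)
    a_th_mm = threshold_crack_depth(stress, K_TH, Y) * 1000.0
    print(f"stress {stress:.0f} MPa: K = {k:.2f}, a_th = {a_th_mm:.2f} mm, "
          f"grows: {flaw_can_grow(stress, SCRIBING_DEPTH_M, K_TH, Y)}")
```

The sketch ignores corrosion and embrittlement, which, as noted above, can push a flaw past the threshold or lower the threshold itself.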
Even if a dynamic crack initiates at a material flaw, this does not mean that the flaw is damage-causing, as long as it is within the allowed specifications with regard to the above considerations.
When evaluating causal flaws, mistaken conclusions can easily occur if the investigation is not done with a wide perspective. This kind of situation is to be expected if, for example, the evaluation is based only on a metallographic grind or one SEM image of the damaged part at high magnification without inspection or specification of typical comparable parts that have proven themselves in operation. One example is the material-specific micro-porosity (shrinkage cavities) in cast parts made from superalloys, as are used in almost all modern turbine blades.
Integral cast turbine disks of low-performance engines of the type used in helicopters and APUs are especially affected by micro-porosity in the hub, which is subject to high cyclic stress. The designer needs to take these specific weak points into account.
If materials or technologies with characteristic flaws are used above their thresholds due to the lack of other viable products, then fracture-mechanics analyses are necessary to determine a safe life span. The crack growth phase must be considered for the safe life span. Naturally, a sufficient safety margin from the critical crack length must be maintained. Sufficiently proven and accurate data for the crack growth behavior of the material and parts must be available.
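A safe life span based on the crack growth phase could be estimated along the following lines. The sketch assumes the Paris crack growth law, da/dN = C · (ΔK)^m with ΔK = Y · Δσ · sqrt(π · a), which the text does not name. It is used here only as a common textbook model. All material constants, stresses, crack sizes, and the safety factor are hypothetical.

```python
import math


def paris_cycles(a_initial_m: float, a_final_m: float, delta_stress_mpa: float,
                 c: float, m: float, geometry_factor: float = 1.0) -> float:
    """Cycles needed to grow a crack from a_initial to a_final.

    Closed-form integral of da/dN = C * (Y * dS * sqrt(pi * a))**m for m != 2,
    with a constant geometry factor (a simplification).
    """
    if not 0.0 < a_initial_m < a_final_m:
        raise ValueError("require 0 < a_initial < a_final")
    if m == 2:
        raise ValueError("this closed form assumes m != 2")
    exponent = 1.0 - m / 2.0
    k_per_sqrt_a = geometry_factor * delta_stress_mpa * math.sqrt(math.pi)
    return (a_final_m**exponent - a_initial_m**exponent) / (c * k_per_sqrt_a**m * exponent)


# Hypothetical inputs: initial flaw 0.4 mm deep, allowed crack depth 4 mm,
# chosen well below an assumed critical crack length.
cycles_to_limit = paris_cycles(
    a_initial_m=0.4e-3, a_final_m=4.0e-3, delta_stress_mpa=300.0,
    c=1e-11, m=3.0, geometry_factor=0.7,
)

SAFETY_FACTOR = 3.0  # hypothetical margin on the calculated growth phase
safe_cycles = cycles_to_limit / SAFETY_FACTOR
print(f"calculated growth phase: {cycles_to_limit:.0f} cycles, "
      f"usable with margin: {safe_cycles:.0f} cycles")
```

Because the result scales with the stress range raised to the power m, small errors in the assumed loads change the calculated life considerably, which is why proven crack growth data are required.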
If flaws are discovered after delivery, then risk assessments with regard to the “threshold flaw limit” are helpful.
Procedure for evaluating the probability that flaws will result in damage:
Try to answer the following questions:
- Which flaws did the parts have that were used for the rating process? These flaws can be seen as typical and allowable for this type of part.
- Which flaws do the parts have which have been proven in serial operation? If there is enough positive experience with parts with comparable weak points, then it would indicate acceptance of these flaws.
- Which flaws are acceptable in new parts according to regulations such as diagrams and material specifications? What do the overhaul manuals allow for run-in parts?
- Did the damage even result from the flaw in question?
- Which flaws can be discovered with the serially implementable prescribed testing procedures? Flaws below the detectable limit are unavoidable and the part must tolerate them.
Figure "Sensitivity to changes": Turbine engines are more sensitive than other machines to any changes to the proven and authorized configuration. “Improvements” to prevent damages and weak points often lead to new damages on the same parts or new problems in other engine parts (Fig. "Disimprovement"). The above model showing balance conditions (top and middle diagrams) is intended to make this behavior more easily understandable. The ball represents the behavior of the part, its movement from the center is the size of the operating influence, the distance from the edge represents the safety against part failure, and the speed at which the ball rolls is the damage speed.
A part with stable behavior has the ability to counter the load changes with a measure sufficient to ensure stable conditions. This is the case, for example, when a self-repairing effect such as that with thermal fatigue cracks leads to de-stressing and slowing of crack progress. The most frequent type of stable part performance is when a part's resistance to elastic deformation is great enough to ensure that it reaches an equilibrium without unallowable changes (e.g. plastic deformation, crack initiation, fractures).
A part with indifferent behavior reacts to load changes with an even damage speed. An example is a wear process that occurs over a long period.
A part with unstable behavior reacts to small load changes with a self-increasing damage process. An example of this is rubbing that heats up the rotor, causing it to expand and rub more vigorously.
This also includes imbalances that do not result in a balanced state and continue to increase the offset of the rotor.
The middle diagram depicts the situation typical for a highly stressed engine part. It is a combination of indifferent and unstable behavior. The width of the platform represents the overdimension of the part. As long as the ball remains on the platform, it is subject to the designed loads and the “natural” aging process. Since overdimensioning is kept to a minimum in engines, the platform is rather narrow. If the load changes move the part beyond the platform (beyond the tolerable loads), it will result in self-accelerating damage.
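The three types of behavior and the platform picture can be made concrete with a toy simulation. This is only an illustrative sketch of the qualitative model: the rate laws, constants, and function names are hypothetical and do not represent any real damage mechanism.

```python
def damage_rate(behavior: str, damage: float, base_rate: float = 0.01) -> float:
    """Damage increment per time step for the three idealized behaviors."""
    if behavior == "stable":
        # Damage speed drops as damage accumulates and settles at an equilibrium.
        return base_rate * max(0.0, 1.0 - damage / 0.2)
    if behavior == "indifferent":
        # Even damage speed, independent of accumulated damage (e.g. wear).
        return base_rate
    if behavior == "unstable":
        # Self-increasing damage: the more damage, the faster it grows.
        return base_rate * (1.0 + 20.0 * damage)
    raise ValueError(f"unknown behavior: {behavior}")


def simulate(behavior: str, steps: int = 100) -> list:
    """Return the damage history over `steps` time steps, starting at zero."""
    damage = 0.0
    history = [damage]
    for _ in range(steps):
        damage += damage_rate(behavior, damage)
        history.append(damage)
    return history


def behavior_on_platform(load_offset: float, platform_half_width: float) -> str:
    """Middle diagram: indifferent on the platform, unstable beyond its edge."""
    return "indifferent" if abs(load_offset) <= platform_half_width else "unstable"


for name in ("stable", "indifferent", "unstable"):
    print(f"{name:12s} damage after 100 steps: {simulate(name)[-1]:.3g}")
```

A narrow platform corresponds to a small `platform_half_width`: even a small load offset then switches the part from the even aging regime to the self-accelerating one.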
The high load levels from various operating factors that are typical and necessary in engines are also the reason why apparently small changes beyond the designed load levels can cause engine parts to fail. Because the size of the overdimension is often unknown (e.g. design based on experience or approval based on test runs), it is also not known exactly how close to the load limits the engine part is operating. Only during engine operation will it become known whether the changes are acceptable. In this case, as well, “the engine will tell us”. A typical example of this is compressor blading. Seemingly small changes to the blading can lead to increased vibrations and fatigue fractures in the blade airfoils.
The transfer of effects from an altered part to other parts is promoted by engine-specific design characteristics (bottom diagram). To save weight, engine parts such as housings have relatively thin walls and are sensitive to vibrations. They are coupled by solid connections across almost the entire length of the engine. The rotors influence one another across the bearing and the gas flow, for example. The gas flow transfers changes (e.g. gas vibrations) in and against the direction of flow and therefore allows different parts to influence one another across large distances.
Figure "Disimprovement" (Example "The engine will tell us"): The problems with remedies and improvements are always ongoing. First, there is the question of the effectiveness of a remedy. In many cases a single measure is not sufficient, and the remedy consists of a bundle of different, complementary, and supporting measures. These can, for example, be coordinated so that some show an immediate improvement for a short length of time (temporary solutions) in order to buy time until more extensive, long-term measures can be taken. Typical examples of temporary solutions are frequent inspections with non-destructive testing procedures to find dynamic cracks. These measures then allow further operation, even if the operating intervals are limited by the safe length of the crack growth phase.
If remedies do not lead to a noticeable reduction in damages, it is a clear sign that:
- Important contributing causes and perhaps even main causes were not recognized or were not properly assessed.
- Measures are unsuitable or insufficient.
The desire to achieve fast and safe success often leads to several measures being implemented at once. This presents the danger that the effect of the individual measures on one another is not understood or that the estimation of their contribution to the observed changes is wrong. This increases the risk that unnecessary measures with undesired side effects lead to new problems.
The increase of ineffective remedies and solutions is called “disimprovement”. It not only has little or no effect, but also leads to an increase of the same damages. While the original damages may no longer occur, new types of damage begin to occur in the same parts and/or other areas of the engine.
“Disimprovement” is especially dangerous when there is no sufficiently accurate damage analysis and the measures that are based on it are not or cannot be implemented in sufficiently small increments, but rather alterations are made that are not accompanied by the necessary experiential background. These situations are difficult to avoid when engines are being developed with revolutionary new technologies.
Example "The engine will tell us" (Fig. "Disimprovement", Ref. 3-1)
Excerpt: “…program officials have confirmed that increased vibrations from a redesigned, reduced-chord inlet guide vane were responsible for cracking a compressor seal during ground tests of a test engine….
…according to….officials, the reduced-chord vanes were added…to lower minor vibratory stresses on its fourth stage blades-stresses that would have been acceptable……
Recent diagnostic tests found, however, that while the “cut-back” or reduced-chord vanes did reduce stresses at lower vibratory frequencies, they actually increased stresses generated in the higher frequency ranges. These higher stresses caused the seal to crack and pieces to be ingested by an engine that was running on test…..”
Comments: The complex relationships and influence of engine components on one another have led to the conclusion that only sufficiently realistic engine operation can be used to prove acceptable functioning of an engine part. This view is concentrated in the principle:
“The engine will tell us”
Figure "Overload": In order to ensure optimum fuel consumption, engines are designed and adjusted to exactly fit the required performance levels that they will be used at. The pronounced competition between different manufacturers and operators minimizes the amount of play for excess performance. This is true for commercial engines as well as military ones, although in the latter the competition is with potential enemies on a technical level. This increases the risk of “disimprovements” (Figs. "Sensitivity to change" and "Disimprovement").
In the commercial branch, the desire is to build very similar engines for several aircraft types with related performance levels. This benefits the operator in terms of logistics, operation, maintenance, and repair. The results are so-called “engine families”, which are intended to cover the largest possible performance range. Since the constructive framework leaves little room for innovation, performance increases must be achieved primarily through increasing the gas temperatures. This leads to the implementation of new technologies such as thermal barriers or single-crystal blades with more complex cooling air structures. Even if these new technologies improve the durability towards high temperatures enough that no shortening of the life span of the parts is to be expected, these parts are more sensitive to changes in the gas temperatures than less highly stressed parts (e.g. a change in temperature distribution). If the thermal barrier fails, e.g. through spalling due to an OOD, the parts will fail much more rapidly. This higher sensitivity can increase the damage rate if, for example, the operator often uses these engines at their upper performance limits.
With military engines, especially, it is often difficult to understand why the client demands that the performance limits be pushed as early as the design phase. This demand does not give enough regard to the usual experience, which is that aircraft become heavier throughout development. This virtually guarantees that serial operation of the engine will be marked by high damage rates, frequent IFSDs, and unnecessary repair costs.
Therefore, it must be strongly recommended that, at least in military engine development, engine performance is sufficiently “overdimensioned” from the beginning.
Example "First stage HPT blade fracture" (Fig. "Overload", Ref. 3-2):
(a transport aircraft) “…..experienced a first stage high pressure turbine (HPT) blade fracture in the number three engine on climb….
…inspections of the remaining power plants and others in the fleet revealed three more `well over HPT1 blade axial crack limits'.
The producer of the engines confirms:
`…..we have an issue with the HPT blades on (this engine Type), which is limited on the higher gross weight (aircraft type)'.
The engine manufacturer:
`…developing a new configuration that should solve the problems'. This includes revising the design of the trailing edge slot, to thicken and strengthen it, and tripling the life of the blade. A new thermal barrier coating will also be added. This was developed to combat an unrelated cracking problem…. (on an other application), traced to oxidation issues caused by the high temperature operating environment.”
If remedies were already developed for similar cases by other operators, at least with regard to the engine part and damage, then it can be assumed that this is a common and typical problem. This may be true primarily for operating conditions that lead to high gas temperatures, such as high outside air temperatures or high-altitude starting locations, for example. The damage indicates that operation of this engine usually occurred very near the limits of the hardware.
Figure "Catastrophic consequences": This diagram describes a typical case of “disimprovement” from Ref. 3-3 that was investigated and published by G. Lange.
After approximately 1000 hours of flight the damage occurred during engine idling on the ground. The compressor disks had burst into many pieces. Comparable damage soon occurred in a second engine.
G. Lange describes the causes of the damage (also see Fig. "Special experiments"):
“Axial compressors made from turbine blade steel X 15 Cr13 were not tempered at 725 °C after hardening as usual, but only at 540 °C in order to achieve the highest possible strength. This treatment made the steel especially sensitive to corrosion, since a net of chromium carbide deposited along the former austenite grain boundaries. The disks still reached the prescribed life span because the corrosive medium, moisture from the air flow, was blown away by the high centrifugal force. However, through a seemingly minor constructive change - the turning of a balance ring - the moisture built up in such a way that the disks shattered due to the stress levels, i.e. dynamic crack corrosion.”
Measures against disimprovements:
- When evaluating an alteration to an engine, experienced specialists from many different fields should be consulted.
- If damage occurs to parts other than the “improved” ones, remember that something was changed. It must be critically tested, whether the change could have any connection with the new damage. If the connection is unclear, assume that there is a connection.
- Attempt to collect as much experience as early as possible through trial runs and/or fleet leaders.
- Avoid changes that go beyond your own “experience horizon”. Evolution precedes revolution.
ETOPS and safety (Refs. 3-4 to 3-9)
The acronym ETOPS stands for Extended-range Twin-engine Operations. It refers to regulations regarding the maximum allowable flight duration of twin-engine commercial aircraft after the failure of one engine, i.e. single-engine flight time. An airport suitable for landing must be reachable within this flight time. Therefore, it indirectly refers to the maximum distance to such an emergency landing strip, depending on the engine type. This has a noticeable effect on the choice of flight paths.
The FAA created the first guidelines regarding this problem in the 1950s. These allowed 60 minutes of single-engine flight (60-minute rule) and were based on the comparatively poor reliability of piston engine aircraft of the 1940s. In 1985 the FAA prescribed new procedures that allowed approved engine/nacelle combinations to exceed the 60 minute limit on single-engine flight.
This change in regulations assumes that the total reliability of affected twin-engine commercial aircraft corresponds to that of previous ones, and that therefore the risk of both engines failing is acceptably low for all possible operational and constructive causes. This also includes the idea that in case of a single engine failing, the operating behavior and reliability (see Fig. "IFSD") of the aircraft and the remaining second engine are sufficient to ensure a safe landing on a suitable runway.
Because these conditions are also dependent on the quality of maintenance and overhauls, an evaluation of the maintenance programs recommended by manufacturers and approved by operators is being undertaken. This includes the work load of the crew and the procedures in case of damage and malfunctions. The operators must provide suitable training programs to this end. This review also requires evaluation of system redundancies with the least possible changes to the delivered configuration.
The evaluation of the engines is described in regulation CAP 513. This is also the basis of the safety requirement that the probability of a catastrophic accident due to failure of both engines must be below 0.3×10^-8 per flight hour.
However, it must be made clear that the failure of both engines need not necessarily be caused by separate incidents for each engine, independent of one another, but that there are other possibilities:
- Fragments escaping from one engine and damaging the other
- Extreme weather conditions (e.g. ice strikes, hail, and rain - see Chapter 5.1) that affect both engines simultaneously
- Fuel shortages that affect both engines (Example "Engine shutdown mistake I", "Engine shutdown mistake II", "Engine separation", "Fuel shortage", "Misdiagnoses", and "Fuel starvation I")
- Systematic maintenance errors in both engines (Ref. 3-4)
Most of these causes of engine failures are not special ETOPS problems. The ICAO defines the IFSD rate (engine in-flight shutdown rate = shut-downs per flight hour) as the measure for the single-engine flight times of the ETOPS ratings (Fig. "IFSD"). In order to keep the probability that both engines fail simultaneously below 0.3×10^-8 per flight hour, the IFSD rate for 180 minutes of detour time was set below 0.02 per 1000 flight hours (0.2×10^-4 per flight hour). It must be proven to the rating agency that the IFSD rates are met with the measures that were given by the evaluation (Fig. "Operating experience").
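The relationship between the IFSD rate, the detour time, and the dual-failure target can be checked with a rough calculation. The sketch below is a simplified estimate under the assumption of independent, constant-rate engine failures. It is not the method prescribed in CAP 513 or by the ICAO, and it ignores the common-cause failures listed above. The function name is hypothetical.

```python
def dual_failure_rate_per_hour(ifsd_rate_per_hour: float, diversion_hours: float,
                               engines: int = 2) -> float:
    """Approximate rate of losing both engines, per flight hour.

    Assumes independent failures: either engine shuts down first, and the
    remaining engine then fails during the single-engine diversion.
    """
    first_shutdown_rate = engines * ifsd_rate_per_hour
    second_failure_probability = ifsd_rate_per_hour * diversion_hours
    return first_shutdown_rate * second_failure_probability


IFSD_LIMIT_PER_HOUR = 0.02 / 1000.0  # 0.02 per 1000 flight hours (from the text)
DIVERSION_HOURS = 3.0                # 180 minutes of detour time (from the text)
TARGET_PER_HOUR = 0.3e-8             # required dual-failure probability (from the text)

rate = dual_failure_rate_per_hour(IFSD_LIMIT_PER_HOUR, DIVERSION_HOURS)
print(f"estimated dual-failure rate: {rate:.2e} per flight hour")
print(f"below target of {TARGET_PER_HOUR:.1e}: {rate < TARGET_PER_HOUR}")
```

Under these assumptions the stated IFSD limit stays just below the target. A single shared cause, such as a fuel shortage or a systematic maintenance error, invalidates the independence assumption and therefore the estimate.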
The evaluation includes:
- Maintenance programs
- Programs for monitoring the engines on the wings (health monitoring)
- Speed and completeness in implementing instructions for maintenance and operation (service bulletins)
- All activities that make it possible for the operator to ensure reliability
This requires the compilation of many data, some of which are:
- All engine shut-downs, both in flight and on the ground. The only exceptions are shut-downs during normal training procedures
- The mean time between failures (MTBF) of engine components that are relevant to reliability
- All incidents in which the desired amount of thrust was not obtained
- Total run times and cycles of the engines and the aircraft.
It must be retroactively (during the flight operation that has already occurred without ETOPS) verified that the “world fleet” of the affected aircraft/engine combination for which approval is being sought reaches a sufficiently low and believable IFSD rate. Naturally, this evaluation also includes all other relevant systems of the aircraft (e.g. hydraulics, steering, cooling, navigation, cabin pressure, etc.).
Special attention is given to auxiliary power units (APUs). It must be shown, for example, that even after the failure of one or two power sources the unit can be started at any altitude without delay.
In 1995 the FAA and JAA accepted an accelerated ETOPS approval (Fig. "Operating experience", Ref. 3-7), which permits new twin-engine commercial aircraft to fly ETOPS routes from the very beginning of their commercial operation. Before operation, the operator of the aircraft must prove to the responsible rating agencies that all necessary ETOPS processes have been implemented in order to ensure a maximum time of 180 flight minutes. The B777 with P&W engines was the first to receive the 180-minute ETOPS approval rating from the FAA. It is interesting to note here that aircraft with four engines have at least as many engine-related incidents requiring rerouting of the aircraft as do comparable twin-engine aircraft.
Figure "IFSD" (Refs. 3-7 and 3-10): IFSD rates for a specific ETOPS time are determined with regard to the requirement that the probability of a catastrophic accident due to total thrust loss is less than 3×10^-9 per flight hour (Ref. 3-10, Fig. "Effects of failures"). In the past, it was necessary that there was sufficient operating experience under commercial conditions before authorities would give twin-engine commercial aircraft ETOPS approval. The basis for this approval was the IFSD rate for a fleet with the same aircraft/engine combination, i.e. the total number of IFSDs over the sum total of the flight time (bottom diagram). The authorities (CAA in this case) then laid down a relationship between the IFSD rate (engine reliability) and the maximum rerouting time (top diagram). Further important criteria are the reliability and redundancy of main systems and the experience of the airline that is seeking ETOPS approval. Operating experiences in the early 1990s (bottom diagram) showed that future modern engines would have sufficiently low IFSD rates to justify 180 minute ETOPS ratings before they even begin commercial operation. This was accomplished with the introduction of new, twin-engine, large capacity aircraft in the second half of the 1990s, the construction of which took into account previous experiences from design and testing. For example, redundancies and reliability were further improved, and testing came closer to simulating real operating conditions. This meant that the preparation time before the first commercial use of an aircraft type became relatively long (Fig. "Operating experience"), but on the other hand it would already be given a 180 minute ETOPS rating when it began operation.
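The fleet IFSD rate described in this caption is a simple quotient, sketched below. The fleet figures are invented for illustration, and counting the flight time in engine flight hours is an assumption. Only the limit of 0.02 per 1000 hours for 180 minutes is taken from the text.

```python
def ifsd_rate_per_1000_hours(ifsd_count: int, engine_flight_hours: float) -> float:
    """Fleet IFSD rate: total shutdowns over total flight time, per 1000 hours."""
    if engine_flight_hours <= 0:
        raise ValueError("flight hours must be positive")
    return 1000.0 * ifsd_count / engine_flight_hours


LIMIT_180_MIN = 0.02  # IFSDs per 1000 flight hours for 180 minutes (from the text)

# Hypothetical fleet record: 12 in-flight shutdowns in 800,000 engine flight hours.
fleet_rate = ifsd_rate_per_1000_hours(ifsd_count=12, engine_flight_hours=800_000.0)
meets_limit = fleet_rate < LIMIT_180_MIN
print(f"fleet IFSD rate: {fleet_rate:.3f} per 1000 h, below limit: {meets_limit}")
```

With few events in the numerator, a single additional shutdown shifts the rate noticeably, which is one reason a large and believable body of fleet experience is required.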
Figure "IFSD statistic" (Ref. 3-24): As shown in the left diagram, engine aggregate components such as gearings, oil and fuel system components, generators, etc. make up over 50% of the causes for IFSDs. This is based on data from engines from various manufacturers. Evidently, the oil systems are especially sensitive. Another point worth noting is the high rate of shut-downs due to false alarms and indicator malfunctions (false fire warnings and instrument malfunctions). The fuel system also creates more IFSD problems than, for example, the turbine, which as a highly stressed hot part would seem to be more likely to fail.
Figure "ETOPS strategy" (Ref. 3-7): In order to obtain and keep an ETOPS rating, the operator of a certain aircraft/engine fleet must make a significant effort (top diagram). This includes, for the entire aircraft:
- Overnight Maintenance
- Airplane Servicing
- Daily 1st Flight Check
- ETOPS Predeparture Check
- ETOPS Dispatch
These must be accomplished at the airports along the routes.
During the validation program (see Fig. "Operating experience") the aircraft in the operator's configuration is put through a multitude of flight cycles, altitudes, weather conditions, and operating scenarios to simulate one year of serial operation. This also includes, as mentioned above, maintenance and service work, which is also an opportunity to check and approve the work procedures given in the operator's handbooks.
In the last phase of the validation program, the operating airline has the chance to operate the aircraft in its flight network. This allows it to demonstrate problem-free operation in the “commercial environment”, e.g. with the corresponding service and maintenance work.
The bottom diagram shows the advantages of longer ETOPS times on a trans-Atlantic route between two airports.
Figure "Operating experience" (Ref. 3-7): The certification work for a 180 minute ETOPS rating takes four years until the aircraft sees commercial use (bottom diagram). Additional steps (dark background) have been added into the certification procedure. The expanded list of tasks includes, for example, the definition of basic requirements as well as development and component testing that includes previous operating experiences (with aircraft/engine-types that are already in serial operation).
The procedure is as follows:
- Analysis and evaluation of relevant operating problems
- Constructive changes as corrective measures
- Verifying and securing success of the corrective measures on a theoretical basis
- Developing realistic tests to verify the success of the corrective measures
The first step of the process is shown in the top part of the diagram (analysis of relevant operating problems):
In this case, more than two million flights of a modern twin-engine commercial aircraft were evaluated with regard to ETOPS-relevant incidents, and this experience was then incorporated into design improvements. It is interesting that 30% of the problems were connected to the engines. A closer look reveals that a notable percentage (see Figs. "IFSD statistic" and "ETOPS certification") could be traced back to false warnings from the oil system.
A typical test program for engines is described in Ref. 3-23:
- 3000 test cycles with the complete engine and interspersed tests with imbalances of the high-pressure and low-pressure sections.
- A second engine underwent 1000 cycles of imbalance tests on the airplane for tests in flight.
- The third engine was “aged” for 2000 cycles and then test flown on the aircraft.
- The fourth engine test on a testing rig ended after 2000 simulated flight cycles and three full 180 minute ETOPS rerouting cycles (this engine was later overhauled and went into operation for the airline).
All testing rig engines were made available to the validation authority for inspection. The fleet leader engine (the engine with the longest run time in the fleet) with regard to cycles remained on the testing rig and further cycles were run to ensure that it stayed ahead of all the other engines in the fleet in serial operation.
Ref. 3-23 describes the damages and problems of an engine type within the first two years of serial operation, as well as the implemented corrective measures:
- The front bearing damage in the medium-pressure compressor led to a different configuration of the bearings (implemented in 89% of engines).
- Coking up of the bearing chamber of the high/medium-pressure shaft led to changed inspection procedures, a new type of lubricant, and new insulated oil pipes (implemented in 100% of engines).
- Rotor blade failures in the high-pressure compressor were corrected by using a new blade version and changing the material they were made from (implemented in 100% of engines).
- Oil leaks from the drive system of the auxiliary gearing led to strengthening of the components (corrected in 100% of engines).
- Problems with the drive system necessitated better balancing of the shaft (implemented in 100% of engines).
Figure "ETOPS certification" (Ref. 3-7): This diagram shows the distribution of damage and problems within the two million flights of the engines of a modern twin-engine commercial aircraft type (see Fig. "Operating experience", top diagram). This experience is intended for use with ETOPS validation. The middle chart shows the affected engine components/component groups, and the bottom chart breaks down the causes of the problems. It is interesting to note the high proportion of false signals from the many monitoring systems for engine functions and surrounding conditions.
The data at the top describe actual incidents that can be classified as “false oil system warnings”, as well as the design measures taken to correct them.
The high percentage of maintenance- and service-related engine damages (22%, bottom chart) shows how multi-faceted the problems can be. On the surface, one might assume that this is due to a weakness on the part of the operator, but it is also possible that poor engine accessibility for maintenance work may be an important factor for the large number of problems.
In the case at hand, many of the IFSDs were due to a poor seal at the connection of a certain oil line, which decreased oil pressure considerably and/or caused oil supply problems. This line connection loosened during operation. However, even after the part was tightened on the ground, the poor seal often caused another IFSD. This type of problem cannot always be solved merely by redesigning the coupling. If, for example, the inaccessibility of the seal causes mounting problems, then design changes to the seal should perhaps be accompanied by relocation of the seal to a more accessible location. A thorough analysis of the problem is necessary to recognize this type of relationship. The analysis should also consider whether the maintenance work can and will be conducted under poor weather conditions (rain, snow, cold).
The number of engines and redundancies:
In this situation, redundancy refers to the taking over of tasks by independent systems. Unlike problems in many other non-redundant technical systems, problems and failures of engines in single-engine aircraft understandably almost always lead to accidents. Redundancy requirements are contained in the relevant regulations (e.g. ICAO Standards, FAR 121 chs. 1 to 37, FAR 135 chs. 1 to 8, NfL guidelines, MIL regulations).
The use of two engines provides a constant or parallel redundancy, which is different from a reserve or standby redundancy. Redundancies improve reliability. While the use of two engines can considerably reduce the number of cases with complete thrust loss (both engines fail simultaneously), the probability of IFSDs per hour (Fig. "IFSD") may increase. The combination of improved engine quality (fewer failures) and decreasing operating times is theoretically making the reliability gap between single- and twin-engine aircraft disappear.
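The trade-off described here can be illustrated with a short calculation. The following Python sketch assumes independent engine failures and a constant shut-down rate per engine flight hour. The function name and the rate of 1e-5 per hour are illustrative assumptions, not values from the references. Under these assumptions, a second engine roughly doubles the chance of at least one IFSD per flight while sharply lowering the chance of losing all thrust.

```python
def shutdown_probabilities(rate_per_engine_hour, flight_hours, engines):
    """Return (P(at least one IFSD), P(all engines shut down)) for one flight.

    Assumes independent engines and a small, constant rate, so the per-engine
    probability is approximated as rate * time.
    """
    p_engine = rate_per_engine_hour * flight_hours
    p_any = 1.0 - (1.0 - p_engine) ** engines
    p_all = p_engine ** engines
    return p_any, p_all


# Illustrative (assumed) rate: 1e-5 shut-downs per engine flight hour, 2 h flight.
single = shutdown_probabilities(1e-5, 2.0, engines=1)
twin = shutdown_probabilities(1e-5, 2.0, engines=2)
print("single-engine: any IFSD %.2e, total thrust loss %.2e" % single)
print("twin-engine:   any IFSD %.2e, total thrust loss %.2e" % twin)
```

Because real engines are never fully independent, the total-thrust-loss value from this sketch should be read as optimistic.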
Depending on the philosophy of the operator, there are advantages to both single- and twin-engine configurations for military aircraft. If the aircraft is only intended for use in battle, with a short expected life span, then a single sufficiently reliable engine may be more cost-effective and weigh less. If the aircraft is intended for use during peacetime with long operating times for pilot training, however, then a twin-engine version will be safer and more cost-effective. Of course, the twin-engine version would not double the safety level relative to the single-engine aircraft, since two engines are a complex system and only a fraction of aircraft accidents can be traced back to engine problems.
However, data from the USAF show that great improvements can be realized. The accident rate of single-engine aircraft was roughly twice that of twin-engine aircraft, even though engine problems were the cause of only about 30% of accidents. The improvement was attributed to factors such as an increased confidence in the system on the part of the pilots, so that the cause “human error” is considerably less frequent in twin-engine military aircraft. Evidently, ejection from the aircraft is not considered as frequently as in single-engine aircraft. These observations were also verified by the accident scenario of single-engine aircraft types such as the F 104 (Starfighter) and Fiat G 91.
Turbine aircraft engines are some of the most reliable machines despite the high stressing of their components (Fig. "Level of safety"). It would be a mistake to put this safety standard into question. Rather, this high standard of safety must be maintained and increased, since the increase in air traffic (Fig. "Inflight shutdown rate") will lead to an increase in the absolute number of damages even if the damage rate remains unchanged at its current low level. One possibility for minimizing damages that is gaining acceptance is early detection of problems (Fig. "Early detection") during maintenance and operation. Statistics can be very helpful in identifying more reliably where special effort will be required in the future (Figs. "Damage statistics", "Engine-related accidents and incidents" and "Rotor damages").
Along with the “classic” problem areas such as the hot parts, over the years new problem zones have come up as engine development proceeded (Fig. "Chronological view"), only to be superseded by even newer ones as technology advanced.
Figure "Double engine failure": This diagram is based on Ref. 3-13.
A complex technical system such as an engine consists of many components that can be subject to completely different failure mechanisms. This means that the individual failure probabilities must be extremely low in order to keep the failure probability of the whole system acceptably low. This often results in technical limits being reached or unbearable costs and effort being required to ensure safety from failures. Redundancies are one possibility of achieving the desired failure safety of this type of system.
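To illustrate why the individual failure probabilities must be so low, the following Python sketch treats the engine as a chain of independent components in which any single component failure fails the whole system. This series model, the component count of 1000, and the probabilities used are illustrative assumptions rather than data from the references.

```python
import math


def system_failure_probability(component_probs):
    """Failure probability of a system that fails if ANY component fails.

    Assumes independent components. Uses log1p/expm1 to stay accurate
    for very small probabilities.
    """
    log_survival = sum(math.log1p(-p) for p in component_probs)
    return -math.expm1(log_survival)


def required_component_probability(system_target, n_components):
    """Equal per-component failure probability needed to meet system_target."""
    return -math.expm1(math.log1p(-system_target) / n_components)


# Assumed example: 1000 components, each failing with probability 1e-6.
p_system = system_failure_probability([1e-6] * 1000)
# Per-component value needed for a system-level target of 1e-6.
p_needed = required_component_probability(1e-6, 1000)
print("system failure probability: %.3e" % p_system)
print("required per-component probability: %.3e" % p_needed)
```

In this sketch the system is roughly a thousand times less reliable than each of its components, which is the pressure that leads designers toward redundancy.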
“Redundancy is the tolerance of a system to the failure of its elements”.
In the general sense, any system that can continue to function sufficiently despite the failure of one or more of its parts can be referred to as redundant.
Redundant behavior can be achieved in many different ways. One possibility is a second, sufficiently different element that takes over the function of the first in case of failure. The two elements in the redundancy must be completely different technically and functionally, and they must not be connected, even indirectly, in any impermissible way, so that they cannot be affected by the same failure mechanisms and fail simultaneously (middle diagram). A true redundancy can reduce the failure probability of a system by several orders of magnitude. Experience has shown, however, that such improvements often cannot be realized. Optimal redundancy can only be achieved if there is complete understanding of the system characteristics and of the influences that could affect them. This understanding is necessary in order to take suitable measures.
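The gap between ideal and realized redundancy can be made tangible with a simple model. The Python sketch below uses a beta-factor approach, a common simplification in reliability engineering that is not taken from the references: an assumed fraction of each element's failures stems from a shared cause and takes out both elements at once. The function names and all numbers are illustrative assumptions.

```python
import math


def redundant_pair_failure(p_single, common_fraction):
    """Approximate failure probability of a two-element redundant pair.

    common_fraction is the assumed share of each element's failures that
    comes from a shared cause and disables both elements (0 = ideal
    redundancy). Valid as an approximation for small probabilities.
    """
    independent = ((1.0 - common_fraction) * p_single) ** 2
    common = common_fraction * p_single
    return independent + common


def decades_gained(p_single, common_fraction):
    """Powers of ten gained compared with a single element."""
    return math.log10(p_single / redundant_pair_failure(p_single, common_fraction))


# Assumed element failure probability of 1e-4.
ideal = decades_gained(1e-4, 0.0)
coupled = decades_gained(1e-4, 0.01)
print("ideal redundancy gains %.1f powers of ten" % ideal)
print("with 1%% shared-cause failures: %.1f powers of ten" % coupled)
```

Under these assumptions, a shared-cause share of only 1% halves the gain from four powers of ten to about two.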
Definitions of terms:
Primary failures are caused by the spontaneous failure of an element of the system (therefore, only by damage in the element itself).
A functional failure of an element is any failure that prevents the element from performing its intended function. Functional failures are caused by internal failures of the element or by input failures from another element.
The statistical estimations of the damage processes in the diagram presuppose allowable simplifications (see Ref. 3-13).
An unrelated failure pair is when the failure of one engine is not connected with the failure of the other, and the failure of a single element did not cause both engines to fail. Other cases can be referred to as related failure pairs, which can be divided into two categories:
Cascading/consequential failure pairs and common–external-cause failure pairs. A common cause is taken to mean a case in which an external influence causes elements of redundant systems to fail.
Example of an unrelated failure pair with ideal redundancy (top diagram):
The case concerns a primary failure of one engine. Damage to the second engine is just as probable as in the first. The probability that both engines fail during the two-hour flight in question is 4/10^8. Typical primary failures are fatigue damages such as dynamic fractures or pitting on roller bearings.
In reality, there is no ideal redundancy in the case of engine failures. The lost thrust may require increased output from the remaining engine (e.g. in order to maintain the required speed/time until reaching the next airport), or the failure may occur during the takeoff phase, when a high total thrust is absolutely necessary. This makes it questionable whether the failure rate for the remaining engine(s) remains unchanged.
Example of a consequential failure pair (middle diagram): In this case, the failure of one engine directly affects the other.
A typical case is uncontained rotor fragments escaping from an engine. This type of incident primarily endangers military aircraft in which the two engines are close together. In commercial aircraft, the danger is primarily to engines that are mounted together on both sides of the airplane in “double nacelles”. Assuming that in every hundredth case the second engine would be dangerously damaged by the first, the probability of a total loss of thrust would be 4/10^6, which is about one hundred times greater than the probability with unrelated failure pairs.
An example of a failure pair with a common external cause (bottom diagram):
Typical examples include weather conditions (see Chapters 5.1 and 5.2) such as hail and ice formation. A total fuel shortage is also conceivable. In this case as well, the probability of a total loss of thrust is relatively high at 2/10^6.
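The relationship between the figures quoted for unrelated and consequential failure pairs can be reproduced with a short calculation. The Python sketch below assumes a per-engine failure probability of 2/10^4 for the two-hour flight. This value is not stated in the text and is chosen only because it reproduces the quoted results. The propagation share of one in a hundred is taken from the text. The function names are illustrative.

```python
# Assumed per-engine failure probability for the two-hour flight
# (for example 1e-4 per hour x 2 h); NOT stated in the text.
P_ENGINE = 2e-4
# From the text: in every hundredth case the failed engine damages the other.
P_PROPAGATION = 1.0 / 100.0


def unrelated_pair(p_engine):
    """Both engines fail independently during the same flight."""
    return p_engine * p_engine


def consequential_pair(p_engine, p_propagation):
    """Either of the two engines fails first and then disables the other."""
    return 2.0 * p_engine * p_propagation


p_unrelated = unrelated_pair(P_ENGINE)
p_consequential = consequential_pair(P_ENGINE, P_PROPAGATION)
ratio = p_consequential / p_unrelated
print("unrelated pair:     %.1e" % p_unrelated)
print("consequential pair: %.1e" % p_consequential)
print("ratio:              %.0f" % ratio)
```

Under this assumption the unrelated pair comes out at 4/10^8 and the consequential pair at 4/10^6, a factor of one hundred. The value of 2/10^6 for a common external cause is the probability of the external event itself and is not derived here.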
Engines in twin-engine fighter aircraft that are connected by a gear box have the characteristics of a consequential failure pair.
In isolated cases (Example "Fuel starvation VI"), even on multi-jet transport aircraft, failed engines dangerously damaged other engines.
Example "Fan failure" (Ref. 3-26):
Excerpt: ”…Information about the damage caused to the aircraft by the fan failures is somewhat conflicting. In the … incident the No 1 engine fan left the pod, traveled forward, under the fuselage and then aft, striking the No 3 nacelle. Damage was also caused to windows, the wing leading edge and other parts of the airframe. Both incidents involved wing-mounted engines.“
Comments: The damage occurred on a large three-engine commercial aircraft. It was fan disk damage due to material-specific weak points in the early 1970s. In the above case, the bladed fan disk evidently separated from the engine and the stored rotation energy moved it ahead of the engine (compare Example "Power-turbine retention system", Fig. "Detached fan rotors"), tilted it, and slid it under the aircraft to the other side, where more blades were thrown free. These blades then damaged the second engine.
Figure "Engine shut down": When an engine on a multi-engine aircraft is damaged, then experience has shown that the wrong engine may be shut down. This can be due to the high psychological stress on the pilot, but is often due to malfunctions, wrong connections (e.g. sensor connections of the fire warning system), or problems with indicators or warning lights.
Example "Engine shutdown mistake I" (Ref. 3-14):
Excerpt: ”…I noticed a slight shaking of one engine…during flight with the Me 262. However, I was unable to tell from the instruments which engine was shaking. I shut down the right engine, but the shaking continued, which meant I had got the wrong engine.“
Example "Engine shutdown mistake II" (Refs. 3-5 and 3-6):
Excerpt: ”…the …aircraft crashed… during an attempted emergency landing…The crew attempted the single-engine landing after reporting a fire warning in one of the aircraft's two….engines. The crew said they had received a fire warning in the No.2, or starboard, engine and had shut it down. However post-crash inspections revealed evidence of a fire in the No.1 port engine, but none in the shut-down engine.“
Ref. 3-25 explains the cause of this failure:
Excerpt: ” …investigators are seeking to determine if the vibration indicators are subject to providing false or inaccurate data or if they are so located that they could be easily misread or confused with other instruments in the same area.“
The comments from an older airline captain are interesting:
Excerpt (Ref. 3-5): ”…experienced flight crews were unlikely to be fooled by a miswired warning system. First of all, he noted, the fire warning system on an aircraft is a no-go system. You check it before every flight and if it isn't working properly, you don't take off.
Secondly, you don't shut down an engine only on the basis of a fire warning light. You first cross-check the other engine gauges and see if the EGT (exhaust gas temperature) is rising or if the oil temperature is up or the engine rpm's are down. There's always the chance of a false fire warning.“
The damage made itself known in the cockpit during flight through fire and vibration warnings. The flight accident investigation provided evidence that a fan blade had broken in the no. 1 engine. The laboratory tests showed that it was a dynamic fatigue fracture that had started on the pressure side of the blade.
The inspection of the no. 2 engine did not reveal any abnormalities that could have caused powerful vibrations during flight.
The damage symptoms of the no. 2 engine were typical for very low RPM (see Chapter 4.2, Fig. "Large fan postcrash") of both rotors in the two-shaft system at the time of impact.
Engine no. 1 showed signs of a fire before impact, which no. 2 did not (see Fig. "Soot coating").
This made it clear that the damage during flight had occurred in the no. 1 engine, but the no. 2 engine was mistakenly shut down.
Example "Engine separation" (Refs. 3-5 and 3-6):
Excerpt: ”…the right engine separated…after takeoff from runway … and fell to the ground 800 ft. past the end of the runway. The pilot followed engine-out procedures and had gained an altitude of 1000 ft. He was notified by the tower that the engine was “lost”, according to a tape recording of tower communications. But he and the first officer did not know the engine had separated until they landed the aircraft on Runway…
…fatigue failure of a rear cone bolt used in engine mounting is the focal point of an investigation..“
This example indicates the amount of stress the crew of an aircraft can be under in an emergency situation. This evidently led to the separation of one engine being taken for a mere engine failure.
Figure "Fuel management": Fuel management in modern transport aircraft is an extremely complex process and serves to ensure an optimal center of gravity for the aircraft. If regulations are observed, then a general lack of fuel is not possible. Example "Fuel shortage" shows that this is not always guaranteed. A relatively complex system such as this one is evidently difficult to understand even during malfunctions and can lead to a lack of fuel in the engines (see examples 3-7 and 3-8).
Example "Fuel shortage" (Ref. 3-17):
Excerpt: ”…the flightcrew reported that the aircraft could not reach the alternate airport, the aircraft experienced a loss of power to all 4 engines and crashed approximately 16 miles from the airport.
The failure of the flightcrew to adequately manage the airplane's fuel load, and their failure to communicate an emergency fuel situation to aircraft traffic control before fuel exhaustion occurred, contributing to the accident was the flight crew's failure to use an airline operational control dispatch system to assist them during the international flight…“
Example "Misdiagnoses" (Ref. 3-18):
Excerpt: ”…indications of uneven fuel flow from main tanks 2 and 3….beginning about 4 hours after departure on an 11-hour transpacific flight. The fuel system had been set up for all engine feed from the No's 2 and 3 tanks. Although it was later established that the number 2 crossfeed valve failed in the closed position….. Fuel system problem was misdiagnosed as a problem of faulty fuel gage indications. Fuel monitoring indicated insufficient fuel flow from no 2 tank when crossfeeding. Engines 1,3 and 4 flamed out when fuel was expended from all tanks except no 2. All 3 flightcrews qualified in the (aircraft) in the 13-months before the incident.“
Comments: This incident occurred in 1988. Evidently even modern fuel management systems are not safe from loss of fuel feed if the crew is not sufficiently comfortable with the complicated systems.
Example "Fuel starvation I" (Ref. 3-19):
Excerpt: ”…During take off initial climb, the airplane experienced a loss of engine power on both engines due to fuel starvation. During the landing, the airplane struck numerous trees. Examination of the aircraft revealed that only the left propeller switch was in the feather position and the fuel selector valve was in the drop tank position. Drop tanks were not installed on the airplane. ….warm up, takeoff and the first 15 minutes of flight should be made with fuel selectors in the reserve position. The reserve tanks did not have any evidence of fuel.“
Comments: The literature indicates that a contributing factor for this incident was the insufficient experience of the pilot with this aircraft type.
Example "Fuel starvation II" (Ref. 3-20):
Excerpt: ”…emergency landing after engine failure due to fuel shortage.“
Example "Fuel starvation III" (Ref. 3-21):
Excerpt: ”…after 3 failed approaches in a snow storm…engine failure due to fuel shortage…and the aircraft crashed.“
Example "Fuel starvation VI" (Ref. 3-22):
Excerpt: ”…after a roll, engine failure due to a problem in the fuel system and crashed.“
Comments: These three cases are examples of incidents with a type of single-jet fighter aircraft. It can be assumed that technical reasons such as the failure of a fuel pump, as well as pilot error (e.g. did not recognize low fuel reserves), led to the fuel shortages.
3-1 “Root Cause Found For F119 Seal Failure”, periodical “Aviation Week & Space Technology” April 27, 1998, page 38.
3-2 G. Norris, “CFMI tackles A340 engine cracks”, periodical “Flight International”, 24-30 June 1998, page 6.
3-3 G. Lange, “Zerstörung von Hubschrauberturbinen durch Einsatz eines Stahls in korrosionsanfälligem Zustand bei gleichzeitig nicht werkstoffgerechter Konstruktion”, “Zeitschrift für Werkstofftechnik/J. of Materials Technology”, Volume 5, 1974, No. 1, pages 9-13.
3-4 “NTSB Cites Maintenance Work in L-1011 Triple-Engine Failure”, periodical “Aviation Week & Space Technology”, May 16, 1983, pages 29, 30.
3-5 D.A. Brown, “Crash of 737-400 Prompts Stricter CFM56 Engine Checks”, periodical “Aviation Week & Space Technology”, January 16, 1989, pages 60,61.
3-6 “Investigators Study Blade Fracture's Role in 737 Crash”, periodical “Aviation Week & Space Technology”, February 20, 1989, page 31.
3-7 D.E. Sayre, L.Q. Anderson, “ETOPS and Service Ready Standards and Processes”, SAE-Paper 921919, presented at Aerotech 92, Anaheim, California, October 5-8, 1992.
3-8 T. Hardeman, “A Twin and a Prayer”, periodical “Aerospace”, July 1989, pages 15 and 16.
3-9 GIFAS, “Airworthiness and Civil Aeronautical Products”, Le Bulletin, September 14, 1989, No 1493-43rd Year (from annual AECMA-meeting).
3-10 “Two legs good, four legs bad?”, periodical “Aircraft Technology Engineering & Maintenance”, Dec 1998/Jan 1999, pages 10-17.
3-11 Vereinigung Cockpit e.V., (Lerchesbergring 24, 6000 Frankfurt 70) “Cockpit Report” 25/18.8.86 “34 Fragen zum Thema ETOPS”.
3-12 B.D. Elsler, “JAA's ETOPS Proposal”, periodical “Business & Commercial Aviation”, February 1997, pages 48-51.
3-13 T.W. Yellman, “Redundancy Killers”, Proceedings of the 1998 Advances in Aviation Safety Conference P-321, pages 33-42 (SAE-Paper 98 1204).
3-14 “CFM56 replacements”, periodical “Aerospace International”, April 1998, page 8.
3-15 “NTSB Investigating Loss of Engine”, periodical “Aviation Week & Space Technology”, December 14, 1987, page 64.
3-16 “Investigation of Boeing 737 Engine Separation Focuses on Failure of Rear Bolt Cone”, periodical “Aviation Week & Space Technology”, January 30, 1989, page 71.
3-17 NTSB Identification: DCA90MA019, microfiche 39506A, Jan 1990.
3-18 NTSB Identification: DCA881A056, microfiche 40452A, May 1988.
3-19 NTSB Identification FTW94LA184, May 1994.
3-20 G. Fischbach, “916 Deutsche F-104 Starfighter, Ihre Bau- und Lebensgeschichten”, page 585.
3-21 G. Fischbach, “916 Deutsche F-104 Starfighter, Ihre Bau- und Lebensgeschichten”, page 602.
3-22 G. Fischbach, “916 Deutsche F-104 Starfighter, Ihre Bau- und Lebensgeschichten”, page 391.
3-23 “Promising early signs for the Trent 800”, “Aircraft Technology Engineering & Maintenance-engine Yearbook 1999”, pages 72-75.
3-24 M. Miller, J. Colehour, K. Dunkelberg, Boeing, “Engine Case Externals Challenges and Opportunities”, paper of the ISROMAC 7 conference, 1998, pages 1604-1611.
3-25 “737-400 Crash Investigation Turns to Engine Instruments”, periodical “Aviation Week & Space Technology”, January 23, 1989, page 67.
3-26 “RB.211 investigation”, periodical “Flight International”, 25 January 1973, page 106.