If It Ain’t Broke, Fix It

As an Airbus cruises toward Auckland, new diagnostic tools spot trouble before it happens.

Robert Pool

September 1, 2001

The Airbus 340 is an hour or so into its 11-hour flight from Hong Kong to Auckland, New Zealand. Twelve kilometers below, the islands of the Philippine archipelago are sliding by off to starboard. That’s when, deep inside one of the plane’s four General Electric-made engines, small bits of insulating skin begin to peel off and fly out the back. Their departure breaches the surface and opens tiny passageways into the compartment where the jet fuel burns. As cold outdoor air seeps in, the compartment’s temperature starts to drop.

In the cockpit, the pilots are aware of none of this; the deviations are still too small to show on their instruments. But the event has not gone unnoted. For starters, a thermocouple in the engine compartment has recorded the slightly depressed temperature. Then, three hours into the flight, the onboard computer that has been collecting readings from the engines uploads the data to a satellite, which relays the information at light speed to a computer in Evendale, OH, just north of Cincinnati. This machine notices the temperature anomaly, and after taking into account other sensor readings, as well as details about the particular engine’s maintenance history, correctly identifies the likely cause: delamination of the skin covering the engine’s thrust reverser. The situation poses no immediate danger to the aircraft. But the airline is notified by telephone, and when the plane arrives in Auckland, mechanics are waiting with the parts needed to repair the skin. They finish in time for the aircraft to leave as scheduled on its next flight.

Five years ago, this could not have happened. The delamination would have worsened gradually, flight after flight, until a mechanic noticed it in the course of a visual inspection. By that point, it would have required extensive and expensive repairs that would probably have forced the delay or even cancellation of the aircraft’s next flight and possibly kept it out of service for days or weeks. But today, thanks in large part to sophisticated new statistical techniques that make it possible to detect previously invisible patterns in data, remote monitoring and diagnostic devices are able to spot many problems as soon as they occur, and sometimes even before. “Doctors talk about people in the future walking around with heart-monitoring devices that will give advance warning of heart attacks and other problems,” says Gerald Hahn, the recently retired founder and manager of General Electric’s Applied Statistics Program in Schenectady, NY. But with complex machinery like aircraft engines and locomotives, he says, “we’re already there.”

To date, remote monitoring has been applied mainly to big-ticket items where unexpected breakdowns can cost a company tens or hundreds of thousands of dollars. But the technology is now moving toward the consumer market. “In the next few years, remote monitoring will be included in the majority of car models,” says Laurence Fourchet, an analyst with the market research company Frost and Sullivan. The promise, she says, is the same that has driven airlines and railroads to install the systems: to sniff out clues that a system is heading for a failure so that preventative action can be taken. Ideally, Fourchet explains, “someone will tell you that sometime in the next few days you need to check the engine, before the breakdown occurs.”

This trend toward self-diagnosing systems appears headed for even greater ubiquity, as efforts are already under way to develop similar capabilities for household appliances like refrigerators, washers and dryers. In the not-too-distant future, an engineered world studded with sensors, computing chips and communications ports, and embedded with sophisticated mathematical tools, could banish the cost and headaches of machine downtime.

Just-in-time Maintenance

There is nothing new about sticking monitors into machines to notice when some variable wanders outside its normal range. The temperature gauge that’s been in automobiles for decades is one simple example: the mechanical equivalent of having patients walk around with thermometers in their mouths.

But the remote monitoring emerging today is of a different magnitude altogether. Think of strapping several dozen monitoring tools onto the patient (blood pressure and heart rate, electrocardiogram, brain wave sensor and more), with the doctor following the data remotely, analyzing it with reference to each patient’s medical history, and then offering regular diagnoses that might include advice on when to pop an aspirin before a headache even occurs. That is the sort of continuous checkup that is now becoming practical for sophisticated machinery.

The value of such supercharged monitoring is obvious to anyone who has ever missed a meeting because a flight was canceled or lost electricity because some part in a utility substation broke. Catching problems early means less expensive repairs, less downtime and less disruption of service and schedules. At the same time, knowing what is going on inside an engine or other piece of equipment can provide the confidence to hold off on a replacement or repair until it is necessary. The ultimate goal is just-in-time maintenance: knowing exactly what repairs to make and when. “There are two kinds of mistakes: replacing too soon, and replacing too late,” Hahn says. “We want to minimize both.”

Many companies have been designing and building equipment so that its health can be monitored during operation. Asea Brown Boveri, the giant European industrial conglomerate, puts remote-diagnostics capability into the propulsion systems it makes for cruise ships and other large vessels; an onboard computer collects operating data and forwards it via satellite to Helsinki for analysis. Turbine Technology Services operates a facility in Orlando, FL, that remotely monitors data from the turbines used in power plants. Other firms monitor the performance of computer equipment such as servers and routers, while the heating, ventilation and air-conditioning systems in many large buildings are equipped with devices that allow engineers to spot problems by observing such variables as airflow and temperature.

But perhaps no company has done more work in these areas than General Electric. A diversified conglomerate with some two dozen divisions, GE manufactures complex industrial equipment (power turbines and ship propulsion systems, in addition to aircraft engines and locomotives) alongside consumer items such as appliances and lighting products. GE is at the forefront of remote monitoring and diagnostics in a variety of fields, says Nick Heyman, an analyst at Prudential Securities who follows the company. Heyman points specifically to GE’s leadership in monitoring of aircraft engines; it was a GE monitoring station outside Cincinnati that spotted the delamination in the engine of the Auckland-bound Airbus. Other GE facilities keep an eye on locomotives, merchant-ship engines, gas turbines and medical imaging devices. The company’s applied-statistics program, founded in 1975 at the corporate R&D center in Schenectady, has grown to be one of the world’s most respected groups in that discipline, and it has provided the ammunition for several GE divisions to develop remote monitoring and diagnostics capabilities. The developments at GE therefore offer a case study of how remote monitoring and diagnostics is transforming the way that complex technological systems are kept in top working condition.

Figuring out from afar what’s wrong with a piece of complex machinery entails two separate but related functions. One is to detect anomalies as soon as they appear; that’s what happened with the Airbus. Complementing that is the forecasting of problems before they even arise. Both capabilities have improved tremendously over the past few years thanks, Hahn says, to a convergence of three very different advances. One is the development of smaller, lighter-weight sensors. Second is the tremendous growth in computing power. Third, and perhaps most important, is the emergence of new statistical techniques that allow researchers to distill useful information from mountains of data.

On the sensor front, much of the progress can be traced to the same sorts of processing advances that have shrunk computer chips so remarkably over the past 20 years. Another factor is also at play, says Larry Abernathy, head of the diagnostics engineering team at GE Engine Services in Evendale, OH. That is the trend to replace mechanical controls with electronic ones that make it much easier to gather data. For example, today’s “fly-by-wire” jets are controlled by dozens of computers that know precisely what is going on in the various systems that they command.

But getting the relevant data is rarely the whole battle. Knowing what to make of it is often the difficult part, and it is here that the other two advances, in computing and statistical analysis, come into play. Consider, for example, the challenge of figuring out what part or system might need attention on a 200,000-kilogram locomotive. Although trains may be commonly perceived as relics from the 19th century, modern locomotives are relentlessly high tech. “Think of them as rolling power plants,” says Joe Cermak, leader of the Remote Maintenance and Diagnostics Center of Excellence at GE’s Transportation Systems division in Erie, PA. The AC6000, GE’s newest and most powerful locomotive, boasts a 6,000-horsepower, turbocharged, fuel-injected diesel engine that spins an alternator to generate some four megawatts of electric power. That’s enough electricity to run 3,000 homes, but here it drives six independent traction motors that give the locomotive enough oomph to pull 100 cars at up to 120 kilometers per hour.

Two dozen microprocessors control the locomotive, allowing the entire operation to be directed by a single engineer sitting before a computer console in the locomotive’s cab. Sensors monitor nearly every variable of interest, from the locomotive’s speed and horsepower output to the voltages, torques and speeds of the individual traction motors to the battery voltage and current. All of this information is collected at GE’s service center in Erie, PA, where about 50 technicians and engineers monitor nearly 300 locomotives belonging to a major U.S. railroad.

Much of this information was available in some form 20 years ago. The difference now is its accessibility for analytical purposes. “We used to have data stored in file cabinets,” Hahn remembers. To make sense of the data required manually entering it into a computer. Even then, the computers were not fast enough to sift through more than a fraction of the information in a reasonable amount of time. As a result, it was very difficult to spot any but the most obvious problems.

That has now changed. Until recently, as Jason Dean, a systems engineer with GE’s Remote Monitoring and Diagnostics group, explains, even something as seemingly simple as spotting a clogged fuel filter was hit-or-miss. A stopped-up filter can cut a locomotive’s horsepower by 20 to 30 percent. That’s a deficit small enough to go unnoticed when the load is light, but one that can slow the train considerably when it is hauling many cars or heading up an incline. And one slow train can back up the entire rail network.

Diagnosing this malfunction is not straightforward. The major indication that a filter is plugged is increased fuel usage, Dean says. But other factors, such as air temperature, the horsepower being produced and the train’s speed, can cause fuel usage to vary by as much as 20 percent, and so the engineer who sees a drop in efficiency cannot know from that alone what’s at the root of the problem. The solution, in theory, is simple: collect historical data on each locomotive’s operations and apply statistical analysis to create a model of the train’s performance. If fuel usage jumps significantly above what the model predicts for a particular set of conditions, the fuel filter should be replaced.
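The model-and-compare idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not GE’s method: it fits a straight line of fuel rate against a single condition (horsepower), whereas a real model would account for many more variables. The function names, the three-standard-deviation alarm threshold and the sample readings are all hypothetical.

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def build_fuel_model(history):
    """history: (horsepower, fuel_rate) pairs recorded during normal operation."""
    xs = [hp for hp, _ in history]
    ys = [fuel for _, fuel in history]
    intercept, slope = fit_line(xs, ys)
    # Spread of the historical readings around the fitted line.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    return intercept, slope, stdev(residuals)

def fuel_filter_suspect(model, horsepower, fuel_rate, n_sigmas=3.0):
    """True if fuel use is well above what the model predicts for this load."""
    intercept, slope, spread = model
    expected = intercept + slope * horsepower
    return fuel_rate - expected > n_sigmas * spread

# Made-up readings for one locomotive (horsepower, fuel rate in arbitrary units).
HISTORY = [(1000, 52), (2000, 101), (3000, 149),
           (4000, 202), (5000, 248), (6000, 301)]
model = build_fuel_model(HISTORY)

print(fuel_filter_suspect(model, 4000, 203))  # False: within normal scatter
print(fuel_filter_suspect(model, 4000, 230))  # True: far above the prediction
```

Only excess fuel use raises the flag, since a reading below the prediction does not point to a plugged filter.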

Embracing the Data

The feeble computers of the past were not up to this job, though. “Years ago,” Abernathy says, “we would see a shift in the data, and we would try to handle it using three or four or five parameters.” Engineers would make educated guesses as to which were the most important variables, such as the train’s speed and the ambient temperature, and then perform a regression analysis (a standard technique that teases out the effects of different factors on a variable). But no matter how carefully the parameters were chosen or the calculations performed, the model’s predictions were inevitably impaired by the limitation on how much the computer could handle.

Today’s technology has removed that limitation. Twenty years ago, statisticians spent a great deal of effort finding clever ways to limit data and still get reasonable answers. Now, empowered with faster number-crunching machines, they embrace the data. “We can look at tens or hundreds of parameters, and we can determine relationships that we could never see before,” Abernathy says. Picking out subtle relationships between variables in operating conditions and in system performance is important, because those relationships, if they are not accounted for, can cause the model to spit out inaccurate results.

Here is where the third key advance, in statistics, comes into play. Statisticians have developed a number of analytical tools that take advantage of this increased computing power to create more accurate projections than are possible with classical regression analysis. Among the most important is a technique using predictive models called “decision trees,” or, more particularly, “classification and regression trees.” This method is well suited for such tasks as predicting whether a locomotive engine will fail on a given outing. It does not assume, as regression analysis does, that the relationship between the input variables (age, distance traveled, operating temperature, oil pressure and so on) and the output variable (whether the machine fails or not) is a matter of simple extrapolation. Given enough data, a decision tree can model virtually any relationship, no matter how complex. It can also handle incomplete data, such as readings made when a sensor is malfunctioning, much more easily than can regression analysis. And whereas regression analysis generally reaches a point of diminishing returns, past which gathering more data will not improve the predictions, that’s not the case with decision trees. With the new method, “more data is always better,” says Jerome Friedman of Stanford University, one of its developers.

The decision tree works, Friedman explains, by dividing a set of data into smaller and smaller partitions until it reaches a best partition for predicting a particular outcome. The data might, for example, consist of thousands of sets of readings made on hundreds of locomotive engines. The outcome in question might be whether a given engine will run smoothly for another 5,000 kilometers. The best indicator of whether an engine will fail might be whether its current operating temperature is above or below a certain level, or perhaps whether it has covered more or less than a certain distance since its last major overhaul. The decision tree begins by slicing the data into the two subsets that best correlate with the two divergent outcomes. Each of the resulting two subsets of data points is then split by the same best-prediction criterion; the resulting four subsets are divided; and so on, until further divisions do not improve the predictive value.
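The splitting procedure Friedman describes can be illustrated with a toy version in Python. This is a minimal sketch, not the full classification-and-regression-tree method: it scores candidate splits with Gini impurity (one common choice, assumed here), stops when no split improves the score, and omits pruning and missing-data handling. The engine readings and all names are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """Try every feature and threshold; return (feature, threshold, score)."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            low = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            high = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not low or not high:
                continue
            score = (len(low) * gini(low) + len(high) * gini(high)) / len(rows)
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best

def build_tree(rows, labels):
    """Split recursively until a further division no longer improves purity."""
    split = best_split(rows, labels)
    if split is None or gini(labels) - split[2] < 1e-12:
        return {"predict_failure": 2 * sum(labels) > len(labels)}  # majority vote
    f, threshold, _ = split
    low = [i for i, row in enumerate(rows) if row[f] <= threshold]
    high = [i for i, row in enumerate(rows) if row[f] > threshold]
    return {
        "feature": f,
        "threshold": threshold,
        "low": build_tree([rows[i] for i in low], [labels[i] for i in low]),
        "high": build_tree([rows[i] for i in high], [labels[i] for i in high]),
    }

def predict(tree, row):
    """Walk from the root to a leaf and return its failure prediction."""
    while "feature" in tree:
        branch = "low" if row[tree["feature"]] <= tree["threshold"] else "high"
        tree = tree[branch]
    return tree["predict_failure"]

# Invented readings: (operating temperature, km since last overhaul); 1 = failed.
ROWS = [(80, 10000), (85, 40000), (90, 20000), (110, 15000),
        (115, 50000), (120, 30000), (95, 90000), (88, 95000)]
FAILED = [0, 0, 0, 1, 1, 1, 1, 1]
tree = build_tree(ROWS, FAILED)

print(predict(tree, (100, 20000)))  # True: hot engine
print(predict(tree, (85, 20000)))   # False: cool and recently overhauled
```

On this made-up data the first split falls on operating temperature, and the cooler engines are then divided by distance since overhaul, mirroring the two indicators mentioned above.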

Thanks to such statistical tools, people like GE’s Abernathy can wring an astonishing amount of information from the signals trickling out of the sensors on technological devices. Jet engines cost between $5 million and $10 million and need an overhaul every three to five years. Because this procedure is also expensive, costing $500,000 to $2 million, airlines strive to maximize the time between one overhaul and the next. An equally important goal is to avoid having too many engines due for major servicing at the same time: airlines have only so many spares on hand.

An overhaul is dictated by one of three factors. The easiest factor to predict is called “life limits on parts.” The Federal Aviation Administration demands that certain critical components be replaced after a given number of flights or flight-hours. About one-third of engine removals are due to this, says Rusty Irving, head of GE’s information technology lab in Niskayuna, NY. Another third occur because the exhaust gas temperature of the engine has risen too high, an indication that parts are wearing out and forcing the engine to run hotter to create the same thrust. The rest of the removals stem from a variety of hardware problems, such as a bird or other foreign object being sucked into the engine, or a leak or crack that exceeds some predefined limit.

The ability to forecast when engines need servicing would allow the company to plan how many spare engines it needs to have on hand at any given time. But in the past, Abernathy says, techniques to predict when an engine would need servicing were little more than blunt tools. The statisticians could figure how soon the typical life limits on parts would force a given engine to be overhauled. They could also calculate how rapidly the exhaust gas temperature increased on average for all the planes in a fleet, which gave a rough indication of how soon an engine might have to come off for that reason. “Then we’d throw in 20 or 30 engines on top of that” (to account for removals due to hardware problems) “and that was our number,” Abernathy explains.

This would work fine if every engine behaved like an average engine. In reality, though, Abernathy says, engine deterioration rates can vary tremendously. The new statistical tools allow GE Engine Services to predict when each individual engine is likely to need an overhaul, then combine those forecasts to create a month-by-month projection of how many engines will be sent in for repair. And, he adds, because the statistical tools also predict why each engine will need servicing, “we can even predict the turn times” (that is, how long each engine will be in the shop). This enables the airline to look a year or more into the future and see how many engines are likely to be “off wing” at any one time. If one month looks to have more than its share, the airline may opt to move up some of the anticipated overhauls to lessen the need for spares and lower the demand on the overhaul center.
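Combining per-engine forecasts into a fleet projection is simple bookkeeping once the individual predictions exist. The Python sketch below assumes, purely for illustration, that each engine’s forecast has been reduced to a predicted removal month and a turn time in whole months; the function names, the sample fleet and the spare count are hypothetical.

```python
def off_wing_by_month(forecasts, horizon_months):
    """Count engines in the shop for each month of the planning horizon.

    forecasts: (removal_month, turn_time_months) pairs, one per engine,
    with months numbered from 0.
    """
    counts = [0] * horizon_months
    for removal_month, turn_time in forecasts:
        last = min(removal_month + turn_time, horizon_months)
        for month in range(removal_month, last):
            counts[month] += 1
    return counts

def months_over_capacity(counts, spares):
    """Months in which more engines are off wing than there are spares."""
    return [month for month, count in enumerate(counts) if count > spares]

# Made-up forecasts for a five-engine fleet over an eight-month horizon.
FORECASTS = [(0, 2), (1, 3), (1, 2), (2, 1), (5, 2)]
counts = off_wing_by_month(FORECASTS, 8)

print(counts)                           # [1, 3, 3, 1, 0, 1, 1, 0]
print(months_over_capacity(counts, 2))  # [1, 2]
```

With two spares, months 1 and 2 are overloaded, which is the kind of signal that might prompt a planner to move one of those overhauls to a quieter month.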

Ultimately, improvements and cost reductions in sensors and computing power will enable remote monitoring and diagnosis to move to smaller items, including consumer products. Major carmakers are already working on systems that will spot and report maintenance problems in cars. At GE, the appliance division is developing refrigerators, washers and other machines that can receive instructions and report operating conditions over the Internet, while Whirlpool, Bosch, Samsung, IBM, Cisco Systems, Microsoft and other companies are working on standards and designs for a variety of Internet-enabled appliances.

One idea under consideration is to build appliances capable of making their own service appointments before their owners even recognize something is wrong. If, for instance, an oven uses too much electricity to achieve the desired temperature, it might alert a service center that the heating element is close to breakdown. Engineers say the major hurdle to realizing that scenario is not technical but psychological: getting people used to the idea of a repairman ringing the doorbell and announcing, “The oven asked me to come.”
