MRO Magazine

Maintenance by Numbers: Detecting the slightest deviation from the norm


November 14, 2001
By PEM Magazine

A century ago, predicting when industrial equipment would break down was a matter of human intuition fueled by experience.

When electronics appeared, miniaturized circuitry made it possible to gather more precise information critical to the operation of the equipment from previously inaccessible areas.

Eventually, all of this sensory input was fed into a microprocessor and displayed on a single computer screen for the critical eye of the operator. As microprocessors evolved, specialized software was devised to sound alarms both visually and audibly when pre-determined operating parameters were exceeded.

Current monitors are able to detect the slightest deviation from the norm, provide the operator with a probable outcome if the situation is allowed to develop, and get his or her attention by flashing warnings on the screen and sounding alarms. Software developed to monitor industrial processes gathers data from any number of sensors, compares it with various models and projects where any deviation is likely to go. The whole procedure would appear, to the innocent bystander, strangely intuitive and disturbingly human. In reality, there is little comparison. Software is unlikely to have the intuitive capacity of an experienced human being. On the other hand, a monitoring system cannot be distracted, will not lose interest late in the day or nod off during the night shift, and performs no differently late Friday afternoon than on Monday morning.


The job of monitoring functions is relatively easy when dealing with mechanical devices that are full of bearings, gears, cams, chains and pulleys, all of which are subject to deformation and wear from the dynamic forces involved. Not so easy to monitor are industrial operations such as the process of continuous casting in the steel industry, where metals are heated to a molten state, then cast continuously through a mould.

In continuous steel casting, the molten steel is poured into an open copper mould which is cooled internally by water so that the steel which comes in contact with the copper cools enough to solidify, forming a solid shell with a liquid core. The solidified steel shell is continuously withdrawn from the mould and progressively cooled until the liquid core solidifies under controlled cooling conditions. Copper is used for the mould because of the high rate at which it conducts heat.

Problems begin when the solidifying shell ruptures because of inconsistent cooling or adhesion to the mould and molten steel spills through the rupture when it is drawn beyond the end of the mould. This can result in extensive damage to the caster, high maintenance cost and loss of production.

These "breakouts," as they are called, can be avoided if the casting speed is reduced when the steel fails to solidify properly. This, of course, reduces productivity. The operator can avert a breakout if he can recognize what is happening in time to react, which is not always the case.

The casting industry tries to avoid breakouts, or at least reduce their number, through a system of pattern-matching. It’s based on past caster breakout events: when current data matches or nears data gathered during previous breakouts, the alarm goes off. Problems arise when a developing situation contains one or two new variants that the system fails to recognize, and the alarm bells fail to ring.

Management at Dofasco’s Hamilton, Ontario, operation, which sits on 720 acres of real estate and employs about 7,400 people at its steel smelting and refining facility, saw this procedure as badly in need of improvement.

Accordingly, Dofasco set about in 1996 to develop an on-line monitoring and fault detection system for continuous casting based on a multivariate statistical (MVS) model, an approach that was, until then, largely an academic exercise at engineering institutions.

It enlisted the aid of professors at nearby McMaster University and proceeded to develop multivariate statistical models for its operation, which needed improvements in the monitoring of continuous casting as well as in the control of sulphur content in its steel-making process.

Dofasco is one of North America’s largest steel makers. Its product lines include hot rolled, cold rolled, galvanized, Extragal, Galvalume and tinplate flat rolled steels, as well as tubular products and laser-welded blanks. The company supplies steel products to customers in the automotive, construction, energy, manufacturing, pipe and tube, appliance, packaging and steel distribution industries.

Developing the model
Working on the caster, Dofasco began by employing Principal Component Analysis (PCA), which boils down a matrix of data into a set of vectors and scalars. This procedure reduces a sea of data to a few variables with little loss of information. The results are then used to calculate test statistics from which the condition of the process may be inferred. The system they developed is designed to generate warnings and/or alarms when warranted so that corrective action may be applied, either manually or automatically.

Currently, the operator at Dofasco No. 1 Caster is required to take corrective action based on the information.

The MVS system the company developed will filter specific signals to address non-stationarity, or drift, and will compensate for missing or invalid data. It will switch models while in operation, consolidate model inputs to simplify processing, and implement alarm logic that works with a detection algorithm to reduce false alarms.

As usual, the garbage in/garbage out rule applies. Selection of the process parameters to be used as inputs in the model is critical. These parameters include lagged variables to add dynamic information. They are used to plot events on a curve instead of only on a straight line. Appropriate detection thresholds for the test statistics must be selected.

The model developed for continuous casting uses the following variables: mould thermocouple readings; lagged mould thermocouple readings; temperature differential between vertical pairs of thermocouples; caster speed; lagged caster speed; mould width; mould level; mould oscillation frequency; mould cooling water temperature differences between inlet and outlet ports; mould cooling water flow rates; tundish weight; tundish temperature; and calculated clogging index. This index is derived from the ratio of the actual position of the control valve, which regulates mould level by throttling the flow of hot metal from the tundish into the mould, to its predicted position.
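Two of the derived inputs in that list are simple calculations. The sketch below is illustrative only: the function names, the sign convention (upper reading minus lower reading) and the sample numbers are assumptions, and the article says only that the clogging index is "derived from" the valve-position ratio, so the plain ratio is used here as a stand-in.

```python
def vertical_differentials(upper_ring, lower_ring):
    """Temperature difference for each vertical thermocouple pair.

    Assumed convention: upper reading minus lower reading.
    """
    if len(upper_ring) != len(lower_ring):
        raise ValueError("rings must have the same number of thermocouples")
    return [u - l for u, l in zip(upper_ring, lower_ring)]

def clogging_index(actual_valve_position, predicted_valve_position):
    """Ratio of actual to predicted control valve position (assumed form)."""
    if predicted_valve_position == 0:
        raise ValueError("predicted valve position must be non-zero")
    return actual_valve_position / predicted_valve_position

# Made-up readings, for illustration only
upper = [152.0, 149.5, 151.0]
lower = [121.0, 120.5, 118.0]
differentials = vertical_differentials(upper, lower)
index = clogging_index(0.75, 0.5)   # valve open further than predicted
```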

Thermocouple input is critical. These sensors are positioned around the mould in two rings, upper and lower, set in pairs, equally-spaced. Data is sampled at the preferred rate of no less than twice per second. All of these inputs are variable, including the grade of steel in process.

The number of active thermocouples changes with cast width, effectively changing the number of inputs. This necessitates separate PCA models for each predetermined cast width range. Two models are required for the No. 1 Caster, one for wide slabs and one for narrow. Each covers the complete range of operating speeds. Changes in speed require the addition of lagged variables to add dynamic information to the model. Past readings of the cast speed measurements are sampled and employed to develop the PCA model. The speed over the previous five consecutive samples, covering 2.5 seconds, and the speed as measured 10 samples or 5 seconds ago, are used.
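The lagged speed inputs can be sketched with a short rolling buffer. This is one reading of the description above, not Dofasco's code: it assumes the model row holds the current speed, the speed at each of the previous five samples, and the speed 10 samples (5 seconds at two samples per second) ago. The class and method names are hypothetical.

```python
from collections import deque

class SpeedLagBuffer:
    """Keeps recent caster speeds and builds the lagged inputs for the model."""

    def __init__(self, recent=5, far_lag=10):
        self.recent = recent
        self.far_lag = far_lag
        # Current sample plus far_lag older ones
        self.history = deque(maxlen=far_lag + 1)

    def update(self, speed):
        self.history.append(speed)

    def ready(self):
        return len(self.history) == self.history.maxlen

    def features(self):
        """Return [current, lag 1, ..., lag 5, lag 10]."""
        if not self.ready():
            raise RuntimeError("not enough samples collected yet")
        h = list(self.history)
        current = h[-1]
        previous = h[-1 - self.recent:-1][::-1]   # newest first
        far = h[-1 - self.far_lag]
        return [current] + previous + [far]

# Feed twelve samples; integers stand in for speed readings
buffer = SpeedLagBuffer()
for sample in range(12):
    buffer.update(sample)
lagged_inputs = buffer.features()
```

With samples numbered 0 to 11, the current reading is 11, the five previous readings are 10 down to 6, and the reading 10 samples back is 1.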

Detection thresholds
Detection thresholds are developed with the use of off-line simulation. This requires distinguishing between normal and abnormal operation and identifying the data as such. The resulting data are then used to simulate the operation of the caster and generate model outputs and subsequently, test statistics. The goal is to establish levels that will not cause the system to alarm under normal operation and always alarm under abnormal operation. This is not achievable in practical terms, but may be optimized. Current thresholds provide a long-term alarm rate of under two percent.
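A rough sketch of how such a threshold might be set from labelled off-line data follows. The approach (take the level that a chosen fraction of normal-operation samples exceed, then check how often abnormal samples exceed it) is an assumption for illustration; the article does not describe Dofasco's exact procedure, and the data here are synthetic.

```python
import math
import random

def threshold_for_false_alarm_rate(normal_stats, target_rate):
    """Pick a level that at most target_rate of normal samples exceed."""
    ordered = sorted(normal_stats)
    n = len(ordered)
    allowed = math.floor(target_rate * n)   # normal samples allowed above the level
    idx = max(n - allowed - 1, 0)
    return ordered[idx]

def alarm_rate(stats, threshold):
    """Fraction of samples whose test statistic exceeds the threshold."""
    return sum(s > threshold for s in stats) / len(stats)

# Synthetic test statistics standing in for simulated caster output
rng = random.Random(0)
normal = [rng.gauss(0, 1) ** 2 + rng.gauss(0, 1) ** 2 for _ in range(2000)]
abnormal = [(rng.gauss(0, 1) + 4) ** 2 + rng.gauss(0, 1) ** 2 for _ in range(200)]

threshold = threshold_for_false_alarm_rate(normal, target_rate=0.02)
false_alarms = alarm_rate(normal, threshold)    # share of normal samples flagged
detections = alarm_rate(abnormal, threshold)    # share of abnormal samples flagged
```

Raising the threshold lowers the false alarm rate but also lowers the detection rate, which is the trade-off the optimization has to balance.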

During operation, data are continuously sampled and input to the computer, which also stores past readings. These can be used in the model calculations and to filter data. The computer provides data to the visual display, which includes graphical diagnostic information, to allow the operator to monitor the system and take corrective action as required. The computer may also be programmed to automatically adjust caster speed in accordance with pre-determined alarm thresholds without operator intervention.

Researchers at Dofasco and McMaster devised a method of compensating for shifts in absolute temperature in the thermocouples which employs an Exponentially Weighted Moving Average (EWMA) filter to dynamically calculate the mean of the sensor readings. This calculated mean is then subtracted from the thermocouple temperatures to generate a deviation signal used by the model. This method is also used on the coolant flows and temperatures. Other signals are not filtered in order to avoid loss of information vital to the process.
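The EWMA step can be sketched in a few lines. The smoothing weight `alpha` and the choice to update the mean before subtracting it are assumptions; the article does not give the filter settings.

```python
def ewma_deviation(readings, alpha=0.05):
    """Return each reading minus a running EWMA estimate of its mean.

    alpha is a hypothetical smoothing weight: small values make the mean
    follow slow drift while leaving sudden changes in the deviation signal.
    """
    mean = readings[0]
    deviations = []
    for x in readings:
        mean = alpha * x + (1 - alpha) * mean   # update the running mean
        deviations.append(x - mean)             # deviation signal for the model
    return deviations

# A steady thermocouple reading followed by a sudden 10-degree jump
signal = [100.0] * 5 + [110.0]
deviation = ewma_deviation(signal)
```

While the reading is steady the deviation is zero; the jump shows up almost in full (9.5 of the 10 degrees) because the running mean has barely moved.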

The system thus developed will continue operation in the absence of a complete set of input data. On occasions where sensor signals are invalidated for such reasons as sensor calibration procedures, sensor failure, sensor drift, etc., the system is capable of tagging the inputs as missing and works with the balance of the inputs to provide monitoring as usual without the annoyance of false alarms. When it is obliged to work with limited input, it will do so successfully. This is achieved in the model by modifying the parameters to ignore the missing data and to increase the contribution from the valid data, so that the results are consistent with those from a full set of data.
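One common way to do this in a PCA model is to estimate the scores by least squares using only the loading rows of the valid inputs. The sketch below shows that technique under stated assumptions; the article does not confirm that this is the method used at Dofasco, and the loading matrix and values are made up.

```python
import numpy as np

def scores_with_missing(P, z, valid):
    """Estimate PCA scores from the valid entries of a scaled sample z.

    P: loadings (variables x components); valid: boolean mask of usable inputs.
    """
    P_valid = P[valid]                     # drop rows for missing inputs
    t, *_ = np.linalg.lstsq(P_valid, z[valid], rcond=None)
    return t

# Hypothetical 4-variable, 2-component loading matrix with orthonormal columns
P, _ = np.linalg.qr(np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]))
t_true = np.array([2.0, -1.0])
z = P @ t_true                             # a sample that fits the model exactly

valid = np.array([True, True, True, False])   # fourth sensor tagged as missing
t_est = scores_with_missing(P, z, valid)

# For comparison: treating the missing reading as zero biases the scores
t_naive = P.T @ np.where(valid, z, 0.0)
```

Here the least-squares estimate recovers the true scores from three of the four inputs, while the zero-filled version does not.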

Alarm circuits
The alarm circuits in the system are self-monitoring. Filtering logic is applied to the detection results to ensure the validity of the alarm. The logic determines if the alarm is persistent before issuing a warning. The alarm condition has to persist for at least five samples (2.5 seconds) prior to any indication of an alarm on an operator screen. The system will react in three levels:

  • Level 1 — Condition normal;
  • Level 2 — Deviated from normal with breakout possible; and
  • Level 3 — Significant deviation with breakout likely.

The screen turns amber during a Level 2 alarm and red during a Level 3 alarm, with an audible alarm added, which prompts the operator to slow the casting speed until conditions improve.
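The persistence rule and the three levels can be sketched as a small state machine. The two threshold values and the class name are hypothetical; only the five-sample persistence and the three levels come from the description above.

```python
class AlarmFilter:
    """Reports level 1, 2 or 3 once a condition has persisted long enough."""

    def __init__(self, warn_threshold, alarm_threshold, persistence=5):
        self.warn_threshold = warn_threshold      # hypothetical Level 2 limit
        self.alarm_threshold = alarm_threshold    # hypothetical Level 3 limit
        self.persistence = persistence            # five samples = 2.5 seconds
        self.warn_run = 0
        self.alarm_run = 0

    def update(self, statistic):
        # Count consecutive samples above each limit; any dip resets the count
        self.warn_run = self.warn_run + 1 if statistic > self.warn_threshold else 0
        self.alarm_run = self.alarm_run + 1 if statistic > self.alarm_threshold else 0
        if self.alarm_run >= self.persistence:
            return 3   # red screen plus audible alarm
        if self.warn_run >= self.persistence:
            return 2   # amber screen
        return 1       # normal

alarm = AlarmFilter(warn_threshold=1.0, alarm_threshold=2.0)
stream = [0.5] * 3 + [1.5] * 4 + [0.5] + [1.5] * 5 + [3.0] * 5
levels = [alarm.update(value) for value in stream]
```

In this run the first excursion lasts only four samples and never reaches the screen, the second reaches Level 2 on its fifth sample, and Level 3 follows after five consecutive samples above the higher limit.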

Dofasco has also applied MVS technology to monitor its process for controlling the sulphur level in steel making. The desulphurization process required the use of Projection to Latent Structures (PLS), another algorithm within MVS. The resulting MVS model was successfully implemented as part of a model-based control application.

If all of this seems like an incredibly complex procedure, it is well to remember that the result can optimize monitoring to the point where malfunctions become rare, if not history. Dofasco’s records show an average of 13 breakouts per year in earlier years, down to eight breakouts in ’97 when MVS was first implemented, and down to three breakouts in the year 2000, following continuous development. Counting only breakouts at the casting speeds the system was specifically designed for, 1.2 metres/min or greater, the total fell from five in ’97 to a single incident in the year 2000.

There’s no question that MVS can be applied as a predictive measure in many other industries. Its strength is in its ability to harness a whole cloud of seemingly unrelated data into a cohesive procedure that lends itself to fine-tuning, becoming increasingly accurate as time goes on.

Harnessing data
Management at Dofasco points out that MVS allows industry to derive greater value from the enormous pool of data collected by expensive information technology (IT) systems. They see MVS technology as capable of converting the information it extracts from these data pools into analytical and predictive models critical to optimizing industrial processes.

John MacGregor, a professor at McMaster’s Department of Chemical Engineering, says, "Companies are spending hundreds of millions of dollars on IT, but less than one per cent of the data pool can be used because they don’t have the technology to properly analyze it. MVS unlocks the door on the remaining 99 percent."

MacGregor has previously used MVS to a limited extent within the chemical industry. The application at Dofasco is a steel industry breakthrough, but the MVS procedure is not limited to steelmaking. Marc Champagne, Corporate Manager for Process Optimization at Tembec Inc. of Temiscaming, Quebec, has had a hand in the application of MVS in the pulp and forestry industry.

"We call our multivariate analysis program ‘Quality Alert’, because that is one of the key functions of multivariate analysis," says Champagne. "Quality Alert has allowed Tembec to improve its product quality and reduce its operating costs by providing the tools to quickly identify production problems and to provide us with better process understanding to improve our products."

Some of the problems tackled at the Temiscaming site using MVS include:

  • Multivariate SPC charting of a paperboard wet end system;
  • Predictions of product quality parameters on-line from the specialty cellulose mill;
  • Multivariate SPC charting of batch digesters; and
  • Multivariate SPC charting of dewatering of the activated sludge from the waste water treatment plant.

"We are using the process data collected on one-five minute sampling rate from our different distributed control systems through a OSISoft PI data historian and manual entries from our grade management software Gradebook from Mountain Systems," says Champagne.

"Tembec Inc. is a fully-integrated forestry company producing market pulp, paper, paperboard, engineered wood products and lumber," he says. "We are the third largest market pulp producer in the world and the 10th largest exporter in Canada, according to Report on Business."

The procedure at Tembec was also developed in association with the program at McMaster.

Michael Dudzic, Dofasco’s Manager of Process Automation Technology, calls MVS an early warning system. "MVS’ predictive capability will allow us to respond to potential problems far sooner, and to take corrective steps before production breakdowns and disruptions occur," he says. "We get a 10-15 second jump on something about to go wrong, which gives us enough time to slow down or take corrective action to avoid the problem."

In addition to Dr. MacGregor, who holds the Dofasco Chair in Process Automation and Information Technology at McMaster, key members of the team include Dr. Theodora Kourti and Vit Vaculik, along with a group of MVS experts at Dofasco where the development has resulted in two patents so far.

"There appears to be a great potential for most manufacturing sectors around the world, including automotive, pharmaceutical and petrochemical," Dudzic says. "Dofasco is currently implementing this technology at our No. 2 Caster which should be in operation in the near future. We are also exploring commercializing opportunities of this application for other casters."

Ed Belitsky is a Toronto-based freelance writer.