MRO Magazine


Taking the ‘OUCH’ out of OEE

In a fantasy manufacturing plant, the equipment would be available and chugging away all of the time, performance would always equal the equipment’s design capacity and the product quality would never fall short of target specifications.

Although this is unachievable in the real world, a method called Overall Equipment Effectiveness (OEE) can track how far short of these ideals one’s plant is falling so remedial steps can be taken.

OEE combines a plant’s availability, performance and quality into a single number, with each of the three variables weighted to reflect its relative importance to that plant. Last year Vancouver-based Teck Cominco’s lead and zinc smelter in Trail, B.C., began obtaining OEE numbers for some of its 16 plants, with the ultimate goal of having an easier and more consistent way of comparing performance across the plants.
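As a rough illustration of how the three variables collapse into one number, the sketch below uses the conventional textbook form of OEE, the plain product of three ratios. The article does not give Trail's actual equations or weightings, so the function name and the sample figures are assumptions made for illustration only.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Conventional OEE: the product of three ratios, each between 0 and 1.

    Assumption: this is the textbook form, not Trail's plant-specific formula.
    """
    factors = {
        "availability": availability,
        "performance": performance,
        "quality": quality,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return availability * performance * quality


# Illustrative figures only, not Trail data.
score = oee(availability=0.90, performance=0.95, quality=0.98)
print(f"OEE = {score:.1%}")
```

Because the factors multiply, a plant that looks respectable on each measure separately can still post a noticeably lower combined score.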

The Trail operation is one of the world’s largest fully-integrated zinc and lead smelting and refining complexes. Its production capacity totals approximately 290,000 tonnes/year of zinc and 120,000 tonnes/year of lead, along with 20 other metal and chemical products.

Trail intends to use OEE to trend plant performance and direct improvement efforts where they are most needed. However, says David Williams, the senior reliability engineer at the Trail operation, “The topic is simple, but the implementation is tough.”

Trail tried to implement OEE in 2001, before Williams joined the reliability group, but shelved the effort because it could not electronically gather the kinds of information required to feed the equations used to generate OEE numbers.

Over the next four years, Trail implemented software that, coincidentally, could pull from its Distributed Control System (DCS) and process just the sort of data required to generate OEE numbers. For example, Suite Voyager (part of the Wonderware software from Invensys Systems) extracts process data from the DCS and puts it where other software can use it. “We use it in several different ways, but its acquisition led us to rethink the implementation of OEE,” Williams recalls.

Trail also purchased Integrated Process Management software (from Dawn International), which pulls data from Suite Voyager, and monitors and tracks key process variables. A third software package Trail purchased, called Qlikview (from QlikTech International AB), displays data. “It is very good at taking various kinds of data and showing them in graphs, tables, speed dials, etc., displaying them in a way that tells a story,” Williams says.

Thus equipped, Trail found it could resume its efforts to set up OEE, with the ultimate goal of focusing maintenance and reliability efforts on those plants with lower OEE numbers.

Williams hints at the challenges OEE is meant to address. “Most plants are set up to judge performance by their production performance — eventual tonnage. But some plants can make their production, and not run well; for example, run fast, break, fix it, run fast. We want to capture all that downtime. If we are supposed to run at 20 tonnes an hour, I don’t care if we run above it. But I do care about the dips below it. We want to know when we dip below our design levels and why. It is really hard to tell if the plant dips below target rates and why.”
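The point about tonnage hiding poor running can be sketched in a few lines. The example below assumes hourly production readings and a design rate of 20 tonnes an hour, caps every hour at the design rate so that running fast cannot offset a dip, and lists the hours that fell short. The function names and the sample shift are hypothetical, not Trail's method or data.

```python
def performance_ratio(hourly_tonnes, design_rate):
    """Share of design output achieved, with each hour capped at the design rate.

    Capping means hours above target earn no extra credit, so only dips count.
    """
    if design_rate <= 0:
        raise ValueError("design_rate must be positive")
    if not hourly_tonnes:
        raise ValueError("hourly_tonnes must not be empty")
    capped = [min(rate, design_rate) for rate in hourly_tonnes]
    return sum(capped) / (design_rate * len(hourly_tonnes))


def dips_below_design(hourly_tonnes, design_rate):
    """Return (hour index, shortfall in tonnes) for each hour below the design rate."""
    return [
        (hour, design_rate - rate)
        for hour, rate in enumerate(hourly_tonnes)
        if rate < design_rate
    ]


# Hypothetical six-hour run: run fast, break, fix it, run fast.
shift = [26, 26, 0, 16, 26, 26]
print(sum(shift))                      # total tonnage for the shift
print(performance_ratio(shift, 20))    # performance counting only the dips
print(dips_below_design(shift, 20))    # which hours fell short, and by how much
```

In this made-up shift the plant makes its 120 tonnes, yet the capped calculation shows it delivered only 80% of design performance, with the shortfall traced to two specific hours.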

There have been many challenges along the way to getting the first OEE numbers. For example, the data requires a lot of work before the other software can read it and OEE numbers can be generated; the IT specialist who has been devoting on average 25% and sometimes as much as 80% of his time to this task has only been on the project for about 14 months.

Obtaining useful OEE numbers is an iterative process, Williams has discovered. “Say a tag (a specific crucial performance measurement) for a plant will be ‘flow’. We set the threshold, but say we determine that just the slight noise in the instrument indicates that there is flow when there is actually none. So we have to raise the threshold.”
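A minimal sketch of the threshold problem Williams describes: a flow tag is sampled once a minute, and any reading at or below the threshold counts as downtime. The readings, the one-minute sample interval and the function name are assumptions for illustration; the article does not describe how Trail's software applies its thresholds.

```python
def downtime_minutes(flow_readings, threshold, minutes_per_sample=1.0):
    """Count the time during which the flow tag reads at or below the threshold."""
    if minutes_per_sample <= 0:
        raise ValueError("minutes_per_sample must be positive")
    stopped_samples = sum(1 for reading in flow_readings if reading <= threshold)
    return stopped_samples * minutes_per_sample


# Hypothetical readings: the plant is down for four of six minutes,
# but instrument noise keeps the tag slightly above zero.
readings = [0.3, 0.2, 0.4, 12.1, 11.8, 0.3]

print(downtime_minutes(readings, threshold=0.1))  # noise is read as flow
print(downtime_minutes(readings, threshold=1.0))  # raised threshold catches the stop
```

With the threshold set too low, the noise alone makes the plant appear to run all shift; raising it above the noise band recovers the four minutes of downtime.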

A tag may be in the wrong place, say, at the point where a flow could be recirculating or leaving the plant. The tag might be measuring recirculating flow even when the plant is actually down. The tag has to be in the correct place so it does not miss the fact that there has been a significant mechanical failure.

Sometimes the instrument that best shows flow when it is running well doesn’t show it when the plant is down.

Something as superficially straightforward as batch time, which may need to be known in order to monitor performance, may be vexing to measure properly. Perhaps the start time should be when a level is reached, not how quickly that level is reached, because reaching that level may be out of that plant’s control and originate with another plant. Or quality may be straightforward to define, but not availability.

Perhaps the run rate in a plant is affected by variables whose effects do not become immediately apparent. In the lead refinery, for example, improperly controlled variables take a week to manifest themselves as performance issues. “You have to figure out approximations for issues that will affect you in the future,” says Williams.

Consider the effect of downtime on performance in a plant that is interdependent with another plant, he adds. “Depending on feed tank volume, a small loss in taking feed out of a tank is much more significant if the tank is nearly full and filling faster than it is being emptied. There is a risk of holding up the upstream plant. The processes are understood, but there are often heated discussions of how to define performance.”

In one plant that obtained OEE numbers in January 2008, even though the quality portion was not meeting spec, a business decision was made to run. It so happens that in this case the negative impact on quality in one plant gives more value in one of the downstream plants. “We have to take a look at our process management criteria. I suspect that the impact of the quality is outweighed by the value of the product, and we will have to reduce the quality criteria. Getting valid OEE numbers that are accurate reflections of availability, throughput and quality is an iterative process,” Williams explains.

When plants are interdependent, effects on upstream or downstream plants have to be considered; e.g., one plant’s output quality may be critical to downstream plants, meaning that quality in that plant’s OEE has to be given extra weight to reflect its importance compared to, say, a plant’s output quality that has no effect on the good functioning of other plants. Or perhaps a plant’s most important trait is throughput, and it is more important that it not hold up other plants. In that case, the ‘performance’ third of its OEE must be given more weight than ‘quality’ and ‘availability’.
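The article does not say how Trail applies its weights, so the sketch below is only one plausible scheme, chosen for illustration: each factor is raised to an exponent proportional to its weight, with the exponents scaled to sum to three so that equal weights reproduce the ordinary product of the three factors. The function name, the weighting scheme and the figures are all assumptions.

```python
def weighted_oee(availability, performance, quality, weights=(1.0, 1.0, 1.0)):
    """Weighted OEE using exponent weights (an assumed scheme, not Trail's).

    Weights are ordered (availability, performance, quality) and rescaled so
    the exponents sum to 3; equal weights give the plain three-way product.
    """
    factors = (availability, performance, quality)
    if any(not 0.0 <= factor <= 1.0 for factor in factors):
        raise ValueError("each factor must be between 0 and 1")
    if len(weights) != 3 or any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be three non-negative numbers, not all zero")
    scale = 3.0 / sum(weights)
    result = 1.0
    for factor, weight in zip(factors, weights):
        result *= factor ** (weight * scale)
    return result


# Hypothetical plant whose output quality is critical to downstream plants.
plain = weighted_oee(0.95, 0.95, 0.80)
quality_heavy = weighted_oee(0.95, 0.95, 0.80, weights=(1, 1, 2))
print(f"equal weights: {plain:.3f}, quality weighted double: {quality_heavy:.3f}")
```

Under this scheme a quality shortfall pulls the score down further when quality carries more weight, which is the behaviour the passage above calls for; a plant where throughput matters most would shift the extra weight to performance instead.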

The biggest challenge at present, however, is training Trail’s plant operators to properly annotate information, that is, properly attribute downtime to specific causes. At the end of a shift, for example, there may have been six minutes of downtime. In order to fix the cause of that downtime, the operator has to properly annotate, or attribute, that time to the responsible equipment. Too much downtime showing up as ‘unaccounted’ on a shift indicates that the operators need more training and coaching to consistently annotate the causes of downtime.
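To make the annotation problem concrete, the sketch below totals a shift's downtime by annotated cause and reports what share was left unaccounted. The event format, the cause names and the function name are hypothetical; the article does not describe how Trail's software stores annotations.

```python
from collections import defaultdict


def summarize_downtime(events):
    """Total downtime minutes per cause and return the unaccounted share.

    `events` is a list of (minutes, cause) pairs; a cause of None or ""
    means the operator did not annotate that stretch of downtime.
    """
    totals = defaultdict(float)
    for minutes, cause in events:
        if minutes < 0:
            raise ValueError("downtime minutes cannot be negative")
        totals[cause or "unaccounted"] += minutes
    total_minutes = sum(totals.values())
    unaccounted = totals.get("unaccounted", 0.0)
    share = unaccounted / total_minutes if total_minutes else 0.0
    return dict(totals), share


# Hypothetical shift with six minutes of downtime, half of it unannotated.
shift_events = [(2, "feed pump"), (1, "screen cleaning"), (3, None)]
by_cause, unaccounted_share = summarize_downtime(shift_events)
print(by_cause)
print(f"unaccounted: {unaccounted_share:.0%}")
```

A persistently high unaccounted share on a given shift is the kind of signal that would point to more coaching on annotation rather than to a particular piece of equipment.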

Some benefits are already coming in from the foundation elements of OEE. Annotations suggest shift-specific issues and equipment problems. Early OEE numbers obtained in January have directed the reliability group to quality issues in parts of one plant, and away from possible wild goose chases.

As more OEE numbers become available, the reliability group will be able, for example, to figure out why a piece of equipment is breaking and then direct maintenance to perform repairs. Ultimately, says Williams, “We’d like to have it as part of our operator screen, so if values deviate, the operator can take corrective action immediately and take remedial action; for example, cleaning a screen or switching to a backup pump. I might see that the output for one of my pumps is dying. I could switch to the spare pump and put in a work order to get the pump replaced.”

That may seem like a fantasy to some, but it’s the reality that Teck Cominco’s Trail operation is steadily working towards.

Montreal-based Carroll McCormick is the senior contributing editor for Machinery & Equipment MRO.