MRO Magazine

Measuring Maintenance Effectiveness


December 1, 2004 | By Cliff Williams

Many years ago, the typical cause and effect diagram (the fishbone) had four categories for its branches — Manpower, Materials, Method and Machine — the four Ms. As companies have become more analytical through using SPC, Lean, Six Sigma, etc., there has been a realization that a fifth M needed to be added — Measurement.

When so many decisions are being made based on the information gained from the process, it becomes obvious that the correct and most relevant measures need to be taken. Whether they are called Limits, Goals, Targets or Key Performance Indicators (KPIs) doesn’t matter as long as you are measuring those things that have the greatest impact on your business and you use the measurements to drive continuous improvement.

A word of warning at this point — if you’re not going to manage it, don’t bother measuring it. There is no bigger waste of time and effort than measuring for the sake of measuring.

As we seek to control maintenance, we need to determine which measurements we should be making in order to identify the areas ripe for improvement. There are many measurements that can be applied in the maintenance field, ranging from the all-encompassing Overall Equipment Efficiency (OEE) to the more obscure Total Productive Maintenance (TPM) measure of ‘Critical Equipment with Failure Mode Effect and Criticality Analysis’ ratio.


Benchmarking against World Class standards or Best in Class is a good start, but the simplest use of measures is to set targets for incremental gains in everything you measure in the maintenance department. World Class standards can then help set the priority of those gains.

The danger of using generic standards is that they are very often skewed by the different ways in which companies report their measures. For instance, some companies back out scheduled downtime from their availability, while others include every minute; some companies set their own speed targets where others use design speed.
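To see how much the reporting convention alone can move the number, here is a minimal sketch that computes availability both ways for the same week. All of the hours are hypothetical figures chosen for illustration; none come from a real plant.

```python
# All figures below are hypothetical, for one week of operation.
calendar_hours = 168.0           # 7 days x 24 hours
scheduled_downtime_hours = 16.0  # planned maintenance, changeovers, etc.
unplanned_downtime_hours = 8.0   # breakdowns

running_hours = calendar_hours - scheduled_downtime_hours - unplanned_downtime_hours

# Convention A: scheduled downtime is backed out of the time base.
availability_backed_out = running_hours / (calendar_hours - scheduled_downtime_hours)

# Convention B: every minute of the week counts against availability.
availability_every_minute = running_hours / calendar_hours

print(f"Scheduled downtime backed out: {availability_backed_out:.1%}")
print(f"Every minute included:         {availability_every_minute:.1%}")
```

With these assumed hours, the same equipment reports 94.7% under one convention and 85.7% under the other, which is why a generic benchmark means little unless you know how it was calculated.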

Another interesting question is, “Whom do ‘Best in Class’ companies benchmark against?” The answer usually is, “Themselves,” so why not start out that way?

One of the most widely used measures is Overall Equipment Efficiency, as it combines three measures: availability, the ability to run at the desired speed, and the ability to produce with minimal defects.

Obviously, without a well-maintained piece of equipment, this number would be low. A World Class measure for OEE is 85%, typically made up of Availability (90%) x Speed or Production rate (95%) x Quality (99%), which multiplies out to 84.6%. When Six Sigma controls are applied, the OEE standard jumps dramatically.
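The World Class figure can be checked with a few lines of arithmetic. The sketch below multiplies the three factors quoted above; the function name `oee` and its argument names are illustrative choices, not part of any standard.

```python
def oee(availability: float, performance: float, quality: float) -> float:
    """Return OEE as the product of its three factors, each a fraction from 0 to 1."""
    factors = {
        "availability": availability,
        "performance": performance,
        "quality": quality,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return availability * performance * quality


# World Class factors quoted in the article: 90% x 95% x 99%.
world_class = oee(0.90, 0.95, 0.99)
print(f"OEE = {world_class:.1%}")  # prints "OEE = 84.6%"
```

Because the factors multiply, a modest shortfall in each one compounds: three respectable-looking numbers in the 90s still yield an overall figure in the mid-80s.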

In most organizations, it is believed that maintenance’s biggest influence is on availability as measured in downtime — the World Class mark being 2.8%, although this varies from industry to industry.

What very often is missed is the reduced run speed due to ‘shake, rattle and roll’ — things upon which maintenance can have a major influence.

Another measure that is often used is Mean Time Between Failures (MTBF). This is derived from Total Scheduled Operating Time divided by the Number of Failures. The danger of using this measure in isolation is that it doesn’t really take into consideration any weighting that may occur.

For example, a quarry truck is required to run for an eight-hour shift. It averages 16 failures per shift. This gives us an MTBF of 8/16=0.5 hrs. This would lead the operations scheduler to believe that he can only rely on the truck for 30 minutes at a time.

If we look at the distribution of the failures, it tells a different story:

Hour       1   2   3   4   5   6   7   8
Failures   0   0   0   0   0   0   8   8

Based on the MTBF, the quarry company would be looking at an alternative truck. Based on the distribution and further analysis (the truck was found to be overheating after six hours on the job), a new water pump was installed, the radiator was flushed and the problem was solved.
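The quarry truck example can be reproduced directly from the hourly failure counts. This is a minimal sketch using the article's figures; the variable names are illustrative.

```python
# Failures recorded in each hour of the eight-hour shift (hours 1 to 8).
failures_per_hour = [0, 0, 0, 0, 0, 0, 8, 8]

shift_hours = len(failures_per_hour)
total_failures = sum(failures_per_hour)

# MTBF = total scheduled operating time / number of failures.
mtbf_hours = shift_hours / total_failures

# First hour (counting from 1) in which any failure was recorded.
first_failure_hour = next(
    (hour for hour, count in enumerate(failures_per_hour, start=1) if count > 0),
    None,
)
failure_free_hours = (
    shift_hours if first_failure_hour is None else first_failure_hour - 1
)

print(f"MTBF = {mtbf_hours} hours")
print(f"Failure-free running time at start of shift: {failure_free_hours} hours")
```

The average says half an hour between failures; the distribution says six clean hours followed by a cluster. Only the second view points toward the overheating problem.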

Mean Time To Repair (MTTR) is an indication of how quickly equipment is returned to service and is found by dividing Total Downtime Caused by Failures by the Number of Failures. As my focus has always been to prevent failures, I believe that time and energy should only be spent on this measure when everything else is in place.

It’s a little like buying a new fire truck so that the fires can be put out more quickly, rather than hiring a fire prevention officer.
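For completeness, MTTR is commonly computed as the total downtime caused by failures divided by the number of failures. The sketch below uses three hypothetical repair durations; the function name and the figures are assumptions for illustration only.

```python
def mean_time_to_repair(repair_hours: list[float]) -> float:
    """Return total failure downtime divided by the number of failures."""
    if not repair_hours:
        raise ValueError("at least one failure is needed to compute MTTR")
    return sum(repair_hours) / len(repair_hours)


# Hypothetical downtime, in hours, for three separate failures.
repair_hours = [0.5, 1.5, 1.0]
mttr = mean_time_to_repair(repair_hours)
print(f"MTTR = {mttr:.2f} hours")  # prints "MTTR = 1.00 hours"
```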

There should be a focus on moving the work from a reactive mode, first to preventive, then to predictive, with the goal being to have 75% of all work planned and scheduled.
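Progress toward the 75% goal can be tracked from labour hours booked against each type of work. In the sketch below the hours are hypothetical, and treating all preventive and predictive work as "planned and scheduled" is a simplifying assumption; your CMMS may classify work differently.

```python
# Hypothetical labour hours for one month, by type of work.
hours_by_work_type = {
    "reactive": 120.0,
    "preventive": 200.0,
    "predictive": 80.0,
}

# Assumption: preventive and predictive work counts as planned and scheduled.
PLANNED_TYPES = {"preventive", "predictive"}
PLANNED_GOAL = 0.75  # the 75% goal mentioned in the article

total_hours = sum(hours_by_work_type.values())
planned_hours = sum(
    hours for work_type, hours in hours_by_work_type.items()
    if work_type in PLANNED_TYPES
)
planned_share = planned_hours / total_hours
meets_goal = planned_share >= PLANNED_GOAL

print(f"Planned and scheduled work: {planned_share:.0%} (goal {PLANNED_GOAL:.0%})")
print("Goal met" if meets_goal else "Goal not yet met")
```

With these assumed hours, 70% of the work is planned, so the department is close to the goal but not yet there.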

The measure of the effectiveness of your predictive maintenance will be seen in the availability part of the OEE measure, since measuring the failures you prevent is impossible.

Another measure of how your effectiveness is improving is in your backlog list — how many jobs are waiting to be completed. This can also be measured as lead time or the time it takes from when a work request is submitted until the work is completed.
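Both views of the backlog can be drawn from the same list of work requests. The sketch below assumes each request is recorded as a submitted date and a completed date, with `None` marking work still outstanding; the dates themselves are hypothetical.

```python
from datetime import date
from statistics import mean

# Hypothetical work requests as (submitted, completed); None means still open.
work_requests = [
    (date(2004, 11, 1), date(2004, 11, 4)),
    (date(2004, 11, 2), date(2004, 11, 12)),
    (date(2004, 11, 8), None),
    (date(2004, 11, 9), date(2004, 11, 11)),
]

# Backlog: jobs waiting to be completed.
backlog = sum(1 for _, completed in work_requests if completed is None)

# Lead time: days from submission to completion, for finished jobs only.
lead_times_days = [
    (completed - submitted).days
    for submitted, completed in work_requests
    if completed is not None
]
average_lead_time = mean(lead_times_days)

print(f"Jobs in backlog: {backlog}")
print(f"Average lead time: {average_lead_time} days")
```

Note that open jobs are left out of the lead-time average here, so a growing backlog of old requests would not show up in that figure. That is one reason to watch both numbers together.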

Costs are another measure of maintenance effectiveness and, as with failures, can be measured in different ways. The most relevant method in a business sense is the cost per unit produced, or Maintenance Costs divided by Total Units Produced (tons, sq ft, cars, etc.).

However, this can obviously be influenced by many factors outside of maintenance, such as production schedules, changeovers, waste, etc.
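A short sketch shows how production volume alone moves this measure. The dollar amounts and unit counts are hypothetical; the point is that the maintenance spend is identical in both months.

```python
def cost_per_unit(maintenance_cost: float, units_produced: float) -> float:
    """Return maintenance cost divided by total units produced."""
    if units_produced <= 0:
        raise ValueError("units_produced must be greater than zero")
    return maintenance_cost / units_produced


# Hypothetical figures: same maintenance spend, different production volumes.
normal_month = cost_per_unit(50_000.0, 20_000.0)
short_schedule_month = cost_per_unit(50_000.0, 10_000.0)

print(f"Normal month:         ${normal_month:.2f} per unit")
print(f"Short-schedule month: ${short_schedule_month:.2f} per unit")
```

In this example the cost per unit doubles from $2.50 to $5.00 with no change in how maintenance performed, simply because the production schedule was cut in half.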

Costs Per Piece of Equipment is a good indicator to be benchmarked. If you are able to compare similar pieces of equipment within the plant or company, the ones that are the problem pieces will become apparent.

With a fully functioning computerized maintenance management system (CMMS), you will be able to monitor the Cost per Repeat Repair, which will point toward the possibility of the need for Root Cause Analysis. It also will indicate where you are only dealing with the symptoms and not the problems.

Overtime cost is a common measure that is closely monitored, although the blind reduction of overtime can cause more problems than it solves. For example, at a paper mill, the average overtime cost in the maintenance department was 12%, which was considered acceptable — except for the Truck Repair area, where overtime was running in excess of 30%.

As this department was staffed by two truck mechanics on the day shift who responded to calls on the off-shifts, the solution seemed simple. One mechanic was put on the afternoon shift, reducing the reliance on overtime by 50%. When, after three months, the overtime number came down to 15%, it was declared a success.

However, in the fourth month, the mill was shut down for 10 hours on two occasions because no trucks were available to load the conveyors. Downtime at the mill was costed at an average of $150 per minute, so the 20 hours (1,200 minutes) of lost production totalled $180,000, three times the wages of a truck mechanic.

It transpired that the jobs that normally took two men to do were left aside until there were so many trucks in need of repair that the mill had to be shut down.
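The cost of the two shutdowns follows directly from the figures given above, as this short calculation shows. The variable names are illustrative.

```python
COST_PER_MINUTE = 150    # average downtime cost at the mill, in dollars
shutdowns = 2            # number of shutdowns in the fourth month
hours_per_shutdown = 10  # duration of each shutdown

downtime_minutes = shutdowns * hours_per_shutdown * 60
downtime_cost = downtime_minutes * COST_PER_MINUTE

print(f"{downtime_minutes} minutes of downtime cost ${downtime_cost:,}")
# prints "1200 minutes of downtime cost $180,000"
```

Set against the overtime saved, this is the comparison that the overtime measure, looked at in isolation, never showed.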

The decision as to what measures should be taken can only be made by those involved in the process. Each company has a culture that will determine what is important to them. The only common theme throughout all industries is to not look at one measure in isolation.

Measures will inevitably drive behaviours, so when choosing your measures, be sure they will drive the behaviours you really want.

Cliff Williams is engineering and maintenance manager at Multipak Ltd., Mississauga, Ont., and a consultant with TMS Total Maintenance Solutions of Markham, Ont. He can be reached at

