MRO Magazine


Ten steps to successful reliability-based asset care

The road to true reliability-based asset care has numerous twists and turns. For many, a drive down that road ends in disaster because data is missing, inaccurate or misinterpreted. Described below are ten steps on how to stay the course.

1. Take a proper inventory of all assets to provide a baseline.

For all equipment and component parts, log all relevant tombstone data such as unique asset number, parent-child relationships, asset specifications and warranty information. Many CMMS packages allow the user to prepare a graphics parts book that stores a CAD drawing for each piece of equipment. Some of these packages allow drill-down capability to zoom in on drawings of components and even spare parts. A hot key link is usually available to jump back and forth between the graphics and equipment master. For infrastructure assets such as roads and parks, or facilities such as fire equipment, some CMMS packages may allow a reference to a GIS locator as well as the graphics link.
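As a rough illustration of the tombstone data described above, the sketch below models an asset register with parent-child links. The field names, asset numbers and date format are hypothetical and are not taken from any particular CMMS.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Asset:
    asset_number: str                      # unique asset number
    description: str
    parent: Optional[str] = None           # asset number of the parent, if any
    specs: dict = field(default_factory=dict)
    warranty_expiry: Optional[str] = None  # ISO date string (assumed format)


def children_of(register, parent_number):
    """Return the asset numbers whose parent is the given asset."""
    return sorted(
        a.asset_number for a in register.values() if a.parent == parent_number
    )


# Hypothetical sample data: one piece of equipment and two components.
register = {
    a.asset_number: a
    for a in [
        Asset("AHU-01", "Air handling unit"),
        Asset("AHU-01-FAN", "Supply fan", parent="AHU-01", specs={"power_kw": 7.5}),
        Asset("AHU-01-TST", "Thermostat", parent="AHU-01",
              warranty_expiry="2026-06-30"),
    ]
}

print(children_of(register, "AHU-01"))
```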

2. Perform a critical path analysis to determine criticality, as well as reactive, preventive & predictive maintenance requirements.

By carefully examining the end-to-end processes within operations, one can determine the criticality of each piece of equipment and its component parts. Critical inputs and outputs to the system should be identified, as well as its sub-systems. Then answer the following questions:

What does this component do?
Identify the points where a potential component or part failure would disrupt critical inputs/outputs.

What happens if it fails?
The answer will range from "catastrophic" to "negligible impact", with numerous choices in between such as "major safety, environmental or operational impact", "reduced efficiency" or "other financial loss (e.g., quality problems)".

What maintenance program is required for this component?
If the impact of failure is negligible, then a reactive or "run to failure" program will most likely be cost-effective. On the other hand, a catastrophic impact may cost-justify the more expensive predictive maintenance program. For all other impacts, a cost/benefit analysis will determine the most economical maintenance program to adopt, i.e., reactive, preventive or predictive.

Preventive and predictive tasks can then be defined to avoid or detect a failure. As well, the user can record corrective tasks required in the event of mechanical breakdown.
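The decision logic in the third question can be sketched roughly as follows. This is a simplification under stated assumptions: the impact labels are reduced to strings, the two extremes are treated as fixed rules even though the text only calls them likely outcomes, and the annual cost figures are invented for illustration.

```python
def choose_program(impact, annual_costs=None):
    """Pick a maintenance program from the impact of failure.

    annual_costs maps program name -> estimated total annual cost
    (maintenance plus expected failure cost); hypothetical input.
    """
    if impact == "negligible":
        return "reactive"        # run to failure is most likely cheapest
    if impact == "catastrophic":
        return "predictive"      # assumed justified by the cost of failure
    if not annual_costs:
        raise ValueError("a cost/benefit estimate is needed for this impact")
    # For everything in between, take the most economical option.
    return min(annual_costs, key=annual_costs.get)


# Invented estimates for a component whose failure reduces efficiency.
estimates = {"reactive": 12000, "preventive": 7000, "predictive": 9500}
print(choose_program("reduced efficiency", estimates))
```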

3. Define options for problem, cause, action and delay codes.

Coded fields greatly simplify data collection and force consistent reporting of failures by narrowing the choices. Descriptive fields are still available on the CMMS for more detailed explanations.

A problem code refers to how a breakdown is reported. For example, in a facilities operation, a tenant might report that a room is excessively hot or cold. A cause code is determined by the maintenance worker upon investigation of the problem. In the example above, possible causes may be: failed thermostat, blown circuit breaker, inoperative fan, and so on. The action taken can be codified in this example as repaired fan, reset circuit breaker or replaced thermostat.

A delay code explains why operations have temporarily ceased, such as awaiting raw materials, operator break, or product changeover.
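One way to picture coded fields is as fixed enumerations, with a free-text note kept alongside for detail. The sketch below uses the facilities example from the text; the class and member names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ProblemCode(Enum):
    TOO_HOT = "room too hot"
    TOO_COLD = "room too cold"


class CauseCode(Enum):
    FAILED_THERMOSTAT = "failed thermostat"
    BLOWN_BREAKER = "blown circuit breaker"
    INOPERATIVE_FAN = "inoperative fan"


class ActionCode(Enum):
    REPLACED_THERMOSTAT = "replaced thermostat"
    RESET_BREAKER = "reset circuit breaker"
    REPAIRED_FAN = "repaired fan"


@dataclass
class FailureReport:
    problem: ProblemCode   # how the breakdown was reported
    cause: CauseCode       # what the maintenance worker found
    action: ActionCode     # what was done about it
    notes: str = ""        # descriptive field for extra detail


report = FailureReport(
    ProblemCode("room too cold"),   # lookup by value rejects unknown codes
    CauseCode.FAILED_THERMOSTAT,
    ActionCode.REPLACED_THERMOSTAT,
    notes="Tenant on third floor reported the problem.",
)
print(report.problem.name, report.cause.name, report.action.name)
```

Because each field accepts only a listed code, two workers reporting the same failure produce the same record, which is what makes later statistical analysis possible.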

4. Define options for work order and equipment status fields.

Status fields can be used to track the cycle time of various activities and delays. Work order status field options could be pending approval, waiting for arrival of parts, assigned to maintenance worker, etc. The equipment or component status field options might include commissioning, warranty repair, third party repair, and others.

5. Define performance measures linking operations and maintenance.

One effective way to focus the attention of both operations and maintenance departments on asset care is to show the relationship between equipment reliability and operations productivity. This can be accomplished through simple measures such as maintenance cost per unit of output, or operations cost per minute of equipment downtime. More important than the actual value of each measure is the trend over time.
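As a small worked example, the sketch below computes maintenance cost per unit of output for three months and checks the direction of the trend. The monthly figures are invented for illustration.

```python
def cost_per_unit(maintenance_cost, units_produced):
    """Maintenance cost per unit of output for one period."""
    return maintenance_cost / units_produced


# Hypothetical (maintenance cost in dollars, units produced) per month.
monthly = [(42000, 10000), (41000, 10500), (39900, 10500)]

series = [cost_per_unit(cost, units) for cost, units in monthly]

# The trend matters more than any single value: is each month
# lower than the one before it?
improving = all(later < earlier for earlier, later in zip(series, series[1:]))

print([round(x, 2) for x in series], improving)
```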

6. Monitor the condition of the assets.

Condition monitoring is becoming an important feature of every CMMS. The simplest packages allow users to manually input data such as equipment usage for triggering PM routines. The more sophisticated CMMS is connected online to PLCs for automated data collection. The software then analyzes incoming data to ensure that trends are within user-defined control limits. When data strays outside the control limits, users are "alarmed" and/or action is taken such as issuing a work order or paging the maintenance planner.
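The control-limit check can be sketched in a few lines. The readings, units and limits below are invented, and a real package would trigger a work order or a page where this sketch only returns the offending readings.

```python
def out_of_limit(readings, lower, upper):
    """Return (index, value) for every reading outside the control limits."""
    return [
        (i, value)
        for i, value in enumerate(readings)
        if not lower <= value <= upper
    ]


# Hypothetical vibration readings (mm/s) with user-defined limits.
vibration = [2.1, 2.3, 2.2, 4.8, 2.4]
alarms = out_of_limit(vibration, lower=1.0, upper=4.0)

for index, value in alarms:
    print(f"reading {index} = {value} is outside the control limits")
```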

7. Conduct statistical analysis to identify recurring problems.

Failures can be prioritized in terms of impact on safety, operations output, and cost. Use statistical analysis of equipment history to determine the high-frequency, high-impact problems and their underlying causes. Pareto analysis is one such tool.
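A minimal Pareto sketch follows, ranking problem codes by frequency with a running cumulative share. It counts occurrences only; weighting by safety or cost impact would be a natural extension. The history data is invented.

```python
from collections import Counter


def pareto(codes):
    """Rank codes by frequency and add the cumulative share of all failures."""
    counts = Counter(codes).most_common()
    total = sum(n for _, n in counts)
    rows, running = [], 0
    for code, n in counts:
        running += n
        rows.append((code, n, running / total))
    return rows


# Hypothetical problem codes pulled from equipment history.
history = ["too cold"] * 6 + ["leak"] * 3 + ["noise"] * 1

for code, n, cumulative in pareto(history):
    print(f"{code:10s} {n:3d} {cumulative:6.0%}")
```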

Fishbone Diagramming or Cause & Effect Diagramming is then used to find the root cause. The CMMS can help link coded problem and cause occurrences with corrective action required. Various predictive and preventive maintenance tasks should be explored to prevent a problem from occurring in the first place. The analysis could also highlight the need for focused training of maintenance workers and/or operations personnel.

Studying the history of status codes may also provide valuable insight into how to improve asset reliability. Problems such as long lead times and inadequate authorization may suggest obvious corrective action. Additionally, the difference between cycle time (i.e., elapsed time including delays) and touch time (actual hands-on productive time) highlights problems with the responsiveness of the maintenance department.
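The gap between cycle time and touch time can be derived from a work order's status history, as in the sketch below. The timestamps are invented, and the "in progress" and "closed" statuses are assumptions added to mark hands-on work and completion.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"


def hours_per_status(status_log):
    """Hours spent in each status; the final entry only closes the log."""
    totals = {}
    for (status, start), (_, end) in zip(status_log, status_log[1:]):
        delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
        totals[status] = totals.get(status, 0.0) + delta.total_seconds() / 3600
    return totals


# Hypothetical status history for one work order.
status_log = [
    ("pending approval", "2024-03-01 08:00"),
    ("waiting for parts", "2024-03-01 12:00"),
    ("assigned", "2024-03-04 08:00"),
    ("in progress", "2024-03-04 09:00"),
    ("closed", "2024-03-04 11:00"),
]

by_status = hours_per_status(status_log)
cycle_time = sum(by_status.values())            # elapsed time including delays
touch_time = by_status.get("in progress", 0.0)  # hands-on time only

print(by_status)
print(f"cycle {cycle_time} h, touch {touch_time} h")
```

In this invented case most of the elapsed time is spent waiting for parts, which points at purchasing lead times rather than at the repair work itself.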

8. Perform repair/replace analysis.

Suppose, in my earlier example, that the problem "insufficient heat" has been caused by a failed thermostat in, say, 80 percent of the cases reported in the equipment history file. The average cost of repairing the unit may have been $225 for parts and labour. Further analysis reveals that to replace all of the thermostats would cost only $125/unit.

Moreover, preventing failure would ensure that tenants are not left in the cold especially during extended cold spells. Thus, repair/replace decisions can be justified based on statistical analysis of equipment history.
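The arithmetic behind this comparison is simple enough to sketch. The per-unit costs come from the example above; the number of units is invented, and the sketch assumes each unit would otherwise need one repair over the period being compared.

```python
def repair_vs_replace(repair_cost, replace_cost, units):
    """Compare total cost of repairing versus replacing a number of units."""
    repair_total = repair_cost * units
    replace_total = replace_cost * units
    return {
        "repair": repair_total,
        "replace": replace_total,
        "saving": repair_total - replace_total,
    }


# $225 average repair and $125 replacement per unit, from the example;
# 20 units is a hypothetical count.
result = repair_vs_replace(repair_cost=225, replace_cost=125, units=20)
print(result)
```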

9. Examine asset history to determine appropriate adjustments to user-defined variables.

The CMMS must be kept current and accurate. Some of the high-end packages analyze actuals compared with user-defined variables such as spare part lead times and safety stock level. Suppose, for example, a user had input a lead time of two days for a given spare part.

The system monitors this measure and reports an actual lead time of, say, 10.5 days. The user could then adjust the lead time accordingly. Other variables that may require adjustment include PM frequency, control limits on equipment usage, equipment/component criticality, and so on.
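The lead time comparison might look something like the sketch below. The individual delivery times and the 25 percent tolerance are assumptions chosen so that the average matches the 10.5 days in the example.

```python
from statistics import mean


def review_lead_time(configured_days, actual_days, tolerance=0.25):
    """Flag a user-defined lead time that drifts too far from actuals."""
    actual = mean(actual_days)
    drift = abs(actual - configured_days) / configured_days
    return {
        "configured": configured_days,
        "actual": actual,
        "needs_review": drift > tolerance,
    }


# User entered two days; hypothetical receipts took 9, 12 and 10.5 days.
review = review_lead_time(configured_days=2, actual_days=[9, 12, 10.5])
print(review)
```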

10. Establish rules-based diagnostics.

The most advanced CMMS packages use coded history to develop a knowledge or rules-based troubleshooting system for identifying the best course of action for a given problem. If, for example, a motor fails in a given piece of equipment, the diagnostic tool determines the statistical likelihood of each cause and suggests corresponding actions to consider.

Such a knowledge-based diagnostic tool could also be used for predicting failures in similar parts, components and equipment, once a pattern is determined. This would lead to monitoring the condition of key components that had not yet failed but were deemed statistically likely to do so, in order to catch a problem before it happens.
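A very reduced sketch of such a diagnostic tool is shown below: it estimates the likelihood of each cause from coded history and suggests the action most often recorded against that cause. The history records and code names are invented, and a commercial package would use far richer rules.

```python
from collections import Counter, defaultdict


def build_rules(history):
    """Tally causes per problem and actions per (problem, cause) pair."""
    causes = defaultdict(Counter)
    actions = defaultdict(Counter)
    for problem, cause, action in history:
        causes[problem][cause] += 1
        actions[(problem, cause)][action] += 1
    return causes, actions


def diagnose(problem, causes, actions):
    """Return (cause, likelihood, most common action), most likely first."""
    tally = causes.get(problem, Counter())
    total = sum(tally.values())
    return [
        (cause, n / total, actions[(problem, cause)].most_common(1)[0][0])
        for cause, n in tally.most_common()
    ]


# Hypothetical coded history: (problem, cause, action).
history = [
    ("motor failure", "bearing wear", "replaced bearing"),
    ("motor failure", "bearing wear", "replaced bearing"),
    ("motor failure", "overheated winding", "rewound motor"),
    ("motor failure", "bearing wear", "replaced bearing"),
]

causes, actions = build_rules(history)
for cause, likelihood, action in diagnose("motor failure", causes, actions):
    print(f"{cause}: {likelihood:.0%} -> consider: {action}")
```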

David Berger is with Western Management Consultants and is the founding president of the Plant Engineering and Maintenance Association of Canada. You can reach him at