MRO Magazine

How to Improve Long-term Reliability of Heavy Equipment

By Bryan Christiansen   

Strategies for improving equipment reliability have long been a subject of key interest among reliability engineers.

Photo: © andov / Adobe Stock

Mathematically speaking, reliability is a statistical term: the probability that a piece of equipment operates without failure under given operating conditions for a specified period of time. Any industrial site contains equipment of varying capacity and complexity. Some of it is small and inexpensive, while some is heavy and complex and requires periodic intervention to sustain its performance. To optimize the reliability of such heavy equipment, it is crucial to carry out long-term planning that addresses reliability over the entire life cycle.
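
To make the definition concrete, here is a minimal Python sketch. It assumes a constant failure rate (the exponential model), which is a simplification that ignores early-life and wear-out effects. The function name and the 2,000-hour mean time between failures (MTBF) are hypothetical, chosen only for illustration.

```python
import math


def reliability(hours: float, mtbf_hours: float) -> float:
    """Probability of running `hours` with no failure.

    Assumes a constant failure rate of 1 / mtbf_hours (exponential model).
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return math.exp(-hours / mtbf_hours)


# Hypothetical asset with an MTBF of 2,000 operating hours.
for mission_hours in (100, 500, 2000):
    print(f"{mission_hours:>5} h: R = {reliability(mission_hours, 2000):.3f}")
```

Under this model the probability of failure-free operation falls as the operating period grows, which is why a reliability figure only means something when the time period is stated.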

Considerations for long-term reliability planning
Long-term planning for heavy equipment involves careful analysis of the factors that affect the equipment's operational effectiveness. These include not just reliability but closely related parameters as well, such as maintainability, availability, and safety. Effective long-term reliability planning is, in effect, life cycle planning: it balances the cost of reliability improvement measures against the benefits achievable over the equipment's life cycle.
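
One rough way to frame that balance is to compare the expected annual cost of running an asset to failure with the cost of a proactive program plus whatever failures remain. The Python sketch below is an illustration only: the function names, the two assets, and every figure are hypothetical, and a real analysis would also weigh safety and other consequences that are hard to price.

```python
def annual_failure_cost(failures_per_year, cost_per_repair,
                        downtime_hours, downtime_cost_per_hour):
    """Expected yearly cost of failures: repair cost plus lost production."""
    return failures_per_year * (
        cost_per_repair + downtime_hours * downtime_cost_per_hour
    )


def cheaper_strategy(run_to_failure_cost, proactive_program_cost,
                     residual_failure_cost):
    """Return the strategy with the lower expected annual cost."""
    proactive_total = proactive_program_cost + residual_failure_cost
    if proactive_total < run_to_failure_cost:
        return "proactive"
    return "run-to-failure"


# Hypothetical small pump: cheap to replace, little downtime impact.
pump = cheaper_strategy(
    run_to_failure_cost=annual_failure_cost(2, 400, 1, 50),
    proactive_program_cost=1500,
    residual_failure_cost=annual_failure_cost(0.5, 400, 1, 50),
)

# Hypothetical heavy asset: costly repairs and expensive downtime.
heavy_asset = cheaper_strategy(
    run_to_failure_cost=annual_failure_cost(2, 20000, 24, 5000),
    proactive_program_cost=60000,
    residual_failure_cost=annual_failure_cost(0.5, 20000, 24, 5000),
)

print("pump:", pump)
print("heavy asset:", heavy_asset)
```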

For example, it is usually more effective to simply replace inexpensive, non-critical equipment upon failure than to invest excessive man-hours in repairing it. On the other hand, it may be worthwhile to implement proactive maintenance regimes for heavy equipment, or even to improve its design so that future products are more reliable. This is because the consequences of failure in heavy equipment significantly outweigh the cost of reliability improvement measures. The following are some of the methods any industry can adopt for long-term reliability planning:

1. Adopting design for reliability (DfR) practices
DfR is the selection of parts and the application of engineering design in such a way that reliability targets can be achieved even under worst-case operating conditions. It is a structured process for identifying potential reliability issues during the design phase and resolving them before they manifest during operation. Several well-known analytical techniques support DfR by systematically analyzing failure, such as Failure Mode and Effects Analysis (FMEA), Fault Tree Analysis (FTA), Finite Element Analysis (FEA), and reliability prediction modeling. In general, DfR is a process comprising six stages:

a) Identifying reliability goals
b) Developing a draft design based on the reliability goals
c) Analyzing the design to overcome uncertainties
d) Verifying the draft design and continuously updating it
e) Validating the final design against customer requirements and continuously improving it
f) Continuously monitoring the current design by obtaining feedback during the operational phase of the equipment
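
Of the analytical techniques named above, FMEA is the easiest to sketch in a few lines. A common FMEA convention scores each failure mode from 1 to 10 for severity, occurrence, and difficulty of detection, then multiplies the three into a risk priority number (RPN) used to rank design attention. The Python sketch below assumes that convention; the failure modes and their scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    name: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (frequent)
    detection: int   # 1 (easily detected) to 10 (hard to detect)

    def __post_init__(self):
        for label in ("severity", "occurrence", "detection"):
            score = getattr(self, label)
            if not 1 <= score <= 10:
                raise ValueError(f"{label} must be between 1 and 10")

    @property
    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Highest-risk failure modes first."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


# Hypothetical failure modes for a piece of heavy equipment.
modes = [
    FailureMode("hydraulic hose rupture", severity=8, occurrence=4, detection=3),
    FailureMode("main bearing seizure", severity=9, occurrence=3, detection=6),
    FailureMode("sensor drift", severity=4, occurrence=5, detection=2),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.rpn:>4}  {mode.name}")
```

The ranking is only a prioritization aid. Teams typically also review any failure mode with a high severity score, whatever its RPN.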

2. Implementing continuous improvement
The continuous improvement (CI) process is carried out once the equipment enters its operational phase. Equipment generally has a varying failure rate throughout its life and, in turn, may perform differently from its infant mortality period through to its wear-out period. ISO 9001 structures this improvement process around the Plan-Do-Check-Act (PDCA) cycle. The PDCA process can serve as a governance structure that stretches from establishing a culture of reliability to managing people, equipment, and information.

a) Managing people: This involves periodic training and development to improve the competency of the people operating the equipment. The purpose is to avoid human-related failures that compromise the equipment's reliability.

b) Managing equipment: This involves continually improving the design of the equipment so that it continues to meet or exceed long-range reliability targets. The purpose is to capture information on design-related failures that occur during the operating phase and channel it to the DfR team so that future designs can be improved.

c) Managing information: This involves continually improving the quality of equipment life data, such as failure and maintainability data. The purpose is to make the data suitable for the necessary analytics and for supporting equipment reliability improvement decisions.
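
As an illustration of the analytics that clean failure and maintainability data makes possible, the Python sketch below computes mean time between failures (MTBF), mean time to repair (MTTR), and the resulting inherent availability from a short failure log. The record layout and all figures are hypothetical.

```python
def summarize_life_data(records):
    """Summarize a failure log.

    records: list of (operating_hours_before_failure, repair_hours) pairs,
    one pair per failure event.
    """
    if not records:
        raise ValueError("at least one failure record is required")
    failures = len(records)
    mtbf = sum(up for up, _ in records) / failures
    mttr = sum(down for _, down in records) / failures
    return {
        "mtbf": mtbf,
        "mttr": mttr,
        # Inherent availability: share of time the asset is able to run.
        "availability": mtbf / (mtbf + mttr),
    }


# Hypothetical log: hours run before each failure, then hours to repair.
log = [(400, 8), (600, 12), (500, 10)]
summary = summarize_life_data(log)
print(f"MTBF {summary['mtbf']:.0f} h, MTTR {summary['mttr']:.0f} h, "
      f"availability {summary['availability']:.1%}")
```

Gaps or errors in the log feed straight into these figures, which is why data quality deserves continual attention.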

3. Implementing CMMS – a long-term reliability planning tool
A computerized maintenance management system (CMMS) is a software platform that helps reliability engineers perform a range of planning activities. A CMMS collects, stores, and processes equipment failure and maintenance data and provides useful analytics on equipment health. Some CMMS platforms can also interface with equipment sensors and other Internet of Things (IoT) devices to fetch real-time data about the equipment’s health and its surroundings.

Having historical as well as real-time sensor data within a CMMS gives planners the opportunity to advance from simple time-based maintenance to more proactive strategies, such as condition-based maintenance and predictive maintenance. Such strategies provide a projected outlook of equipment performance over its useful life, a crucial element of long-range life cycle analysis.
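
A simple form of condition-based planning fits a trend line to recent sensor readings and estimates when the trend will reach an alarm limit. The Python sketch below uses an ordinary least-squares line for this. It is not how any particular CMMS works, and the vibration readings and the alarm limit are hypothetical.

```python
def hours_until_threshold(hours, readings, threshold):
    """Estimate operating hours left before a rising trend hits `threshold`.

    Fits a least-squares straight line to (hours, readings). Returns None
    when the trend is flat or falling, since no crossing can be projected.
    """
    n = len(hours)
    if n < 2 or n != len(readings):
        raise ValueError("need at least two matching hour/reading pairs")
    mean_x = sum(hours) / n
    mean_y = sum(readings) / n
    sxx = sum((x - mean_x) ** 2 for x in hours)
    if sxx == 0:
        raise ValueError("hours must not all be identical")
    slope = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(hours, readings)
    ) / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_hour = (threshold - intercept) / slope
    # Measure from the latest reading; zero means the limit is already reached.
    return max(0.0, crossing_hour - hours[-1])


# Hypothetical bearing vibration (mm/s) logged every 100 operating hours.
remaining = hours_until_threshold(
    hours=[0, 100, 200, 300],
    readings=[2.0, 2.5, 3.0, 3.5],
    threshold=5.0,  # assumed alarm limit
)
print(f"Projected hours until alarm limit: {remaining:.0f}")
```

A straight line is the crudest possible model. Real degradation is often nonlinear, so a projection like this is best treated as a prompt for inspection rather than a prediction.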

4. Implementing a parts kitting process
Parts kitting is the process of gathering all the tools and spare parts needed to complete a repair or perform maintenance on critical equipment. Critical equipment often has limited redundancy and can be maintained or overhauled only in specific time slots. It also has a high cost of downtime and a significant impact on the operational reliability and safety of the plant. Such critical assets typically comprise a large number of subsystems, each of which may need its own spare parts and specialized tools for repair and maintenance.

For example, a heavy reactor in an oil refinery requires an entire plant unit to be shut down before it can be maintained. Moreover, plant reliability and availability targets allow only a short window before normal operation must be restored. To overcome this time constraint, parts kitting can be adopted to expedite maintenance work that would otherwise lose significant time to mobilizing resources.
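
In software terms, the core of kitting is a check of the job's bill of materials against what is actually on hand before the maintenance window opens. The Python sketch below shows that check. The item names and quantities are hypothetical, and a real system would also track tool calibration, reservations, and lead times.

```python
def kit_shortages(kit_requirements: dict, stock_on_hand: dict) -> dict:
    """Return {item: quantity still missing} for an incomplete kit.

    An empty result means every required part and tool is available.
    """
    shortages = {}
    for item, needed in kit_requirements.items():
        missing = needed - stock_on_hand.get(item, 0)
        if missing > 0:
            shortages[item] = missing
    return shortages


# Hypothetical kit for a planned overhaul.
kit = {"gasket set": 2, "seal kit": 1, "torque wrench": 1, "catalyst screen": 4}
stock = {"gasket set": 5, "seal kit": 0, "torque wrench": 1, "catalyst screen": 3}

shortages = kit_shortages(kit, stock)
if shortages:
    print("Kit incomplete, order before the shutdown:", shortages)
else:
    print("Kit complete, ready to stage.")
```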

CMMS aside, none of this is simple to implement in practice, but such is the case with most long-term business activities. When done smartly, investing in long-term reliability of heavy equipment is bound to reap a strong return on investment. MRO
Bryan Christiansen is the Founder and CEO at Limble CMMS (a mobile CMMS software). He can be reached at

