Utility Asset Risk Management With RCM
By JAMES REYES-PICKNELL AND ALEXANDER BAKULEV
April 29, 2019
Power utilities can use RCM as a foundation for good asset management.
Our lives depend heavily on electricity, and its supply is so reliable that we rarely think about it until the lights go out. Fortunately, that is a fairly rare occurrence. The design of many cities’ distribution networks allows for some redundancy of supply, particularly in areas with a higher concentration of hospitals, universities, government and private sector offices, and dense shopping.
In North America, states and provinces began introducing asset management principles through laws and regulations governing their application to public electrical or civil infrastructure. Forward-looking utilities foresaw this trend and began introducing good asset management practices over the past several years. These utilities and their municipalities are continually improving, meeting regulatory requirements while keeping costs down. Their engineers understand that it is less expensive to run a reliable asset than it is to allow it to become unreliable.
The international standards for asset management (ISO 55000, 55001, and 55002) make reference to managing risks. The standards are voluntary and the province does not explicitly require compliance, but it is strongly implied that any organization under the province’s mandate needs a good asset risk management program. At the heart of that is the identification of all sorts of risk. In an electric distribution utility, many risks are directly attributable to asset failures at the level of specific individual failure modes. That requires an in-depth look at how the many assets being managed can fail.
Some utilities experimented with reliability centred maintenance (RCM) in the 1990s. First attempts were not entirely successful, and the method, as it was applied, proved cumbersome. In the early 2000s, a renewed attempt at a North American utility that the authors worked with took a fresh approach. The utility trained staff and facilitators, ran a series of pilot projects to prove the concept, got excellent results, embraced the method, and used it on all of its major asset classes.
That utility eliminated unnecessary maintenance that could be shown to be increasing failures, added preventive and predictive maintenance where none existed, and introduced various testing programs and asset and system redesigns. The resulting program was entered into the utility’s computerized maintenance management system and executed on schedule, achieving higher reliability and lower costs.
In RCM, the utility identified how assets could fail, the consequences of those failures, and what best to do to avoid, minimize, or mitigate those failures and/or consequences. Risks were identified and to a large extent quantified using knowledge from the analysis participants, records of past failures and studies, and data from maintenance management systems. It was a fact-based decision process aimed at managing risks arising in and from their vast fleet of physical assets delivering electricity.
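As a rough illustration of what quantifying risk at the failure mode level can look like, the following Python sketch ranks a few failure modes by expected annual cost (estimated failures per year multiplied by cost per event). It is not the utility's method: the `FailureMode` class, the `rank_by_risk` function, and every asset name and number below are hypothetical and chosen only for illustration. A real RCM analysis also weighs safety, environmental, and hidden-failure consequences that a single cost figure does not capture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One failure mode of one asset (all fields are illustrative assumptions)."""
    asset: str
    description: str
    failures_per_year: float  # assumed estimate from records or team knowledge
    cost_per_event: float     # assumed total consequence cost of one failure

    @property
    def annual_risk(self) -> float:
        # Expected annual cost = frequency x consequence.
        return self.failures_per_year * self.cost_per_event


def rank_by_risk(modes):
    """Return failure modes sorted from highest to lowest expected annual cost."""
    return sorted(modes, key=lambda m: m.annual_risk, reverse=True)


# Hypothetical inputs; none of these figures come from the utility.
modes = [
    FailureMode("cable chamber", "drain blocked, chamber floods", 0.20, 50_000.0),
    FailureMode("pad-mount transformer", "oil leak at gasket", 0.50, 8_000.0),
    FailureMode("breaker", "fails to trip on demand", 0.02, 400_000.0),
]

ranked = rank_by_risk(modes)
for m in ranked:
    print(f"{m.asset}: {m.description}: {m.annual_risk:,.0f} per year")
```

A ranking like this only helps to prioritize attention. The choice of failure management policy for each failure mode still comes from the RCM decision logic.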
Once the analyses were completed and the maintenance programs initiated, operations remained stable and asset performance was mainly as expected. The utility had added data fields to its maintenance management software system to capture needed data for continuous improvement purposes. It was to be used to facilitate better reliability analysis and provide better evidence on which to make future RCM decisions.
It worked. Several years later, however, some assets were noticed to be failing more often than they had before. This revealed a trend: some of the benefits of RCM were being eroded. Why? The short answer is that the utility’s “operating context” had changed. Operating context describes the assets, how they are used, and how they matter to the utility and its customers. Several contextual factors have changed since that initial RCM effort.
Since the initial RCM effort, the electrical infrastructure, mostly built from the 1950s to the 1980s, started to surpass its useful life limits. With limited funding available for renewal, power reliability showed a sharp negative trend. The aged assets revealed new failure modes not captured in the original analysis.
To fight the aging trend, the utility introduced a new approach to asset management, relying mostly on the PAS 55 standard (PAS 55-1, Publicly Available Specification, “Specification for the Optimized Management of Physical Assets,” 2004, British Standards Institution), which included asset condition and risk assessment. With that improved understanding of end-of-life condition, the utility developed rigorous asset management plans aimed at curtailing the negative trend. It increased its execution workforce and commenced an infrastructure renewal program, introducing new assets into the system.
As with many manufactured products today, the newer assets are highly engineered to minimize weight and cost while delivering higher capacities. Consequently, those assets can be more highly stressed than the older assets they replace, have different failure modes, and may also suffer increased rates of some failures.
The territory the utility operates within has grown, adding to its population, increasingly housed in high-rise complexes. The utility expanded its network, built new stations, and upgraded capacity where demand increased. There are new assets to maintain, and some are new designs, unknown to the utility’s workforce. For example, new SF6 gas-insulated breakers and transformers, part of a new substation, were analyzed using RCM. Both assets were relatively new designs, and although there are hundreds installed worldwide, none are particularly old, so operating and maintenance history is scarce.
No other utilities that we know of had carried out RCM on these assets.
Both analyses were supported by equipment manufacturers, and in both cases failure modes were identified that surprised the manufacturers. Manufacturers don’t operate and maintain their own assets, so, unsurprisingly to the analysts, the manufacturers didn’t know as much about their products as one might expect. While the assets appear to be more complex on the surface, their new maintenance programs, defined using RCM, are expected to result in long, uninterrupted service lives, an improvement on the experience with more conventional designs now in service. In those cases, the manufacturers gained new ideas about maintenance that they will be recommending to their customers.
Climate change is also having an effect, particularly in lower-lying, flood-prone areas of the city. Underground apparatus is vulnerable to flooding, and the utility experiences increased failure density localized in those areas. Summer thundershowers are more severe, with increased flooding and lightning strikes. Other utilities are experiencing increased spring flooding due to river ice breakup. These changing weather patterns lead to flood-related failures in cable chambers. Discussions have arisen as to whether failures caused by blocked drains should be treated as hidden or evident, and how best to test for them.
On their own, each of these factors may have a minor effect. Together they mean that assets behave differently, and new failure modes need to be investigated and a proper maintenance plan put in place. Aging, new assets, localized load demand growth, and climate change have increased the incidence of failures that the original RCM did not anticipate, driving the need to revisit the initial RCM-based programs.
Subsequent to the initial RCM effort, the utility carried out RCM training to keep skills alive; however, many engineers retired, increasing the risk of knowledge drain from the company. Eventually it was necessary to run courses to educate recently hired engineers and to re-analyze many of the assets.
The old analyses were stored in a dedicated RCM software system that is no longer supported, rendering them difficult to access. Modifying the original analyses would therefore be difficult, so the utility is taking a fresh look. This results in changes to the prescribed failure management policies that arose from the earlier initiative. The maintenance management system has more data, much of it is more accurate, and it is proving to be more useful than what was available to the original analysis teams. The utility also has a cadre of younger engineers with excellent data inquiry and analysis skills, so the data being gathered, which is taking advantage of fields added years before, can be put to good use.
The quality of the original RCM analyses depended on the quality of facilitation, analysis team understanding, and correct application of the RCM method. That is still true today; however, to ensure that asset managers are indeed doing the right things in looking after their assets, they are now using an RCM method that has been independently certified to comply with the RCM standard, SAE JA-1011.
They are also carrying out independent reviews of each analysis. Those reviews ensure the analysis follows the method and the guidance for using JA-1011 that is provided in standard SAE JA-1012. These independent reviews have been thorough. Most have resulted in each analysis undergoing several iterations before being deemed fully compliant. All iterations are well documented, including remarks, responses, and corrections. For each analysis, once it has passed that review, a certifying letter is issued.
It’s all part of a rigorous asset management practice: assuring solid practice through verification and audits, improving continuously by recognizing change, and using feedback loops from field actuals to maintenance planning.
RCM is an excellent way to gather evidence both for making good decisions about failure management strategies and for showing that they are indeed making good decisions. It also helps to demonstrate that the utility is spending wisely in looking after its assets, so that rates are kept as low as possible.
The utility has a number of good asset management practices in place and has had them for years. They manage risks and use RCM to identify those arising from failure of their fleet of electrical distribution assets.
Additionally, they use RCM’s very sound logic to determine what to do to mitigate, eliminate, or otherwise manage the consequences of risk deemed reasonably likely to occur at some point. RCM has stood the test of time, and is providing a very solid foundation on which to base failure management policies and the resourcing requirements to achieve them.
James Reyes-Picknell, PEng is Principal Consultant of Conscious Asset, providing business consulting and training services in Physical Asset / Maintenance Management and Reliability. He is author of several books, including Reliability Centered Maintenance – Reengineered in 2017 and Uptime – Strategies for Excellence in Maintenance Management, 2015.
Alexander Bakulev is CEO of METSCO Energy Solutions. He has contributed his extensive utility experience to a variety of projects in asset management, lifecycle optimization, risk profiling, and regulatory submissions for large hydro generation, transmission, and distribution utilities. He co-authored several publications and research papers for IEEE, CIGRE, and CEATI, on asset management and risk-based optimization.
He has made numerous presentations at industry conferences, educational courses, and workshops, and has provided expert opinion in rate filing regulatory proceedings. Bakulev holds a PhD in Economics.