Equipment Reliability - a review of Inherent and Actual Reliability

1. Historical perspective of Reliability

An intuitive and subjective application of reliability concepts dates as far back as 1705, when Denis Papin built a boat powered by hand-cranked paddles (David, W. The Invention of Science. New York: Harper Collins, 498-504. 2015). In 1736, Jonathan Hulls was granted a patent for a Newcomen engine-powered steamboat, a design which was later improved by James Watt. The first steam-powered ship, the Pyroscaphe, a paddle steamer fitted with a Newcomen engine, was built in France in 1783 by the Marquis Claude de Jouffroy and his colleagues as an improvement on the 1776 Palmipede design. To improve on the earlier design, the Marquis introduced redundancy by fitting the steamship with masts and sails. As steam engine reliability improved over time, this redundancy was abandoned. The development of twin-engine aircraft during and after the First World War (1914 - 1918) was likewise an attempt to compensate for the limited reliability of the aircraft engines of that time.
Reliability engineering was also applied on the German side during the Second World War, in connection with the V1 flying bomb and the V2, an early ballistic rocket weapon. The absence of a pilot to make adjustments changed the situation from one of skill to one of probabilities. Maintained systems were not considered because these weapons did not come back, so the dominant focus at that time was reliability rather than system maintainability or availability. As a discipline, reliability engineering emerged from the American military and space programmes of the 1960s and 1970s, when very complex, expensive equipment had to be made to perform reliably.

2. Reliability and Failure – an overview

Reliability is defined as the probability that a product, component, piece of equipment, or system will perform as required for a stated period under stated operating and environmental conditions. It is determined by the decisions made during the pre-production and production stages of the equipment life cycle. The achievement of reliability can be undermined by two kinds of failure: functional failures and reliability failures. Failures can also be classified in three different ways: by cause, by suddenness, and by degree. A functional failure is the inability of a design to perform at a level that the user deems satisfactory; a complete loss of function also qualifies as a functional failure. A reliability failure, on the other hand, occurs after some period of use and can be due to design errors, faulty material, or manufacturing, assembly, and commissioning errors (infant mortality period); human errors during operation (constant failure period); or material fatigue (wear-out period), as shown in Figure 1.

Note also that the definition of reliability always refers to a stated period. Another important concept is the failure rate, defined as follows: failure rate = (number failing per unit time at time t) / (number surviving at time t). The failure rate λ(t) is the rate at which failures occur in the interval t1 to t2 and is defined as:

$$\lambda(t) = \frac{R(t_1) - R(t_2)}{(t_2 - t_1)\,R(t_1)}$$

where R(t) is the reliability, i.e., the probability of survival to time t.
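
As a rough illustration of the interval failure-rate definition above, the following Python sketch computes the rate from the reliabilities at the start and end of an interval. The function name and the survival counts are hypothetical and chosen only for illustration; they do not come from any real data set.

```python
def interval_failure_rate(r_t1: float, r_t2: float, t1: float, t2: float) -> float:
    """Failure rate over the interval [t1, t2], given R(t1) and R(t2)."""
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    if not (0.0 < r_t1 <= 1.0) or not (0.0 <= r_t2 <= r_t1):
        raise ValueError("need 0 < R(t1) <= 1 and 0 <= R(t2) <= R(t1)")
    # Fraction of the units alive at t1 that fail in the interval, per unit time
    return (r_t1 - r_t2) / ((t2 - t1) * r_t1)


# Hypothetical test: 1000 units start, 950 survive to 100 h, 931 survive to 200 h
r_100 = 950 / 1000
r_200 = 931 / 1000
rate = interval_failure_rate(r_100, r_200, t1=100.0, t2=200.0)
print(f"{rate:.6f} failures per unit per hour")
```

With these assumed counts, 19 of the 950 units alive at 100 h fail over the next 100 h, which works out to about 0.0002 failures per unit per hour.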

An example of a failure rate ("bathtub") curve is shown in Figure 1.

Figure 1: Bathtub curve

In this discussion piece we will not cover the detailed reliability mathematics.

3. Reliability - an overview

Reliability is concerned with the interdisciplinary use of probability, statistics, and stochastic modelling, combined with engineering insight into the design and a scientific understanding of failure mechanisms. Design reliability depends on the reliability specifications at the maintainable component level. Our focus in this article is on inherent and actual reliability. We start with definitions, then look at what influences each and what you can do to attain the highest possible reliability of your products, equipment, or systems. We cover only maintainable components, equipment, and systems, i.e., those that are repaired and returned to use, and not non-maintainable ones, which are usually run to failure. We are discussing the green bubble (01) in Figure 2.

Figure 2: Components and objectives of equipment reliability

Apart from the aerospace industry, much work on reliability has been done in the electronics industry and nuclear power industry.

3.1. Inherent Reliability

Design reliability depends on the reliability specification at the maintainable component level. Inherent reliability, therefore, refers to the maximum achievable level of reliability that is established through the design choices and manufacturing processes of your piece of equipment or system. Inherent reliability is influenced by various factors along the asset's life cycle stages, as shown in Figure 3.

  • Specification stage - accounts for your business objectives and equipment reliability specifications (specified by reliability engineers or design engineers to the designer).
  • Design stage - the designer’s choice of parts, expected loads & operating environment, component configuration (series or parallel), reliability modelling, reliability testing (accelerated life testing), reliability engineering (design FMEAs) and reliability science (properties of materials and causes for deterioration that can lead to part or component failures).
  • Manufacturing stage - manufacturing equipment and process quality, process controls, inspections, and quality assurance checks, etc.
  • Transportation - how well components or assemblies are protected during transportation against vibration, impact, false brinelling, contamination ingress, etc. affects the actual reliability that will be realised once commissioned.
  • Storage / Environment - how equipment is stored once on site, and whether exposure to harsh environmental elements for extended periods initiates a destructive cycle of damage, etc.

Figure 3: Factors influencing inherent reliability.
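
To make the design-stage choice of component configuration (series or parallel) concrete, here is a minimal Python sketch. It assumes that component failures are statistically independent, which the article does not state, and the 0.90 pump reliability is a made-up figure used only for illustration.

```python
from math import prod


def series_reliability(reliabilities):
    """All components must work: multiply the individual reliabilities."""
    return prod(reliabilities)


def parallel_reliability(reliabilities):
    """At least one component must work: 1 minus the chance that all fail."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical example: two identical pumps, each with reliability 0.90
pump = 0.90
print(f"series:   {series_reliability([pump, pump]):.4f}")
print(f"parallel: {parallel_reliability([pump, pump]):.4f}")
```

Under the independence assumption, placing the two pumps in series lowers the system reliability to 0.81, while placing them in parallel raises it to 0.99, which is why configuration is treated as a design-stage influence on inherent reliability.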

Inherent reliability is recognized as a design attribute, meaning you cannot increase it no matter how much maintenance or inspection you perform on your equipment or system after it has been installed and put into operation. It can be thought of as the overall upper limit of your equipment's or system's reliability and availability.

3.1.1. How to increase inherent reliability?

Reliability growth above the initial inherent reliability is only possible through design changes. It may be possible to achieve some reliability growth by optimising your maintenance programme; however, these improvements will only help your product, equipment, or system to reach its design reliability. Reliability growth monitoring is an area of growing importance in the reliability space.
During the development programme of a new piece of equipment, it is of great help to management to be able to see how reliability is improving as modifications are incorporated during the test programme.

Tools are available for reliability growth modelling, such as the Duane and Crow-AMSAA models. The Duane model is the result of the work of J. T. Duane, who derived an empirical relationship based on the observed mean time between failures (MTBF) improvement of a range of items used on aircraft. Duane observed that the cumulative MTBF, θc (total time divided by total failures), plotted against total time T on log-log paper, gave a straight line. The slope (α) gave an indication of reliability (MTBF) growth:

$$\log \theta_c = \log \theta_0 + \alpha\,(\log T - \log T_0)$$

where θ0 is the cumulative MTBF at the start of the monitoring period T0.

An example of a Duane plot is shown in Figure 4.

Figure 4: Sample Duane plot for reliability growth modelling

The Duane model is applicable to a population with several failure modes, which are progressively corrected, and in which several items contribute different running times to the total time. Therefore, this model is not appropriate for monitoring early development testing, and it is common for early test results to show a poor fit to the Duane model. Knowledge of the effectiveness of past reliability improvement efforts can provide guidance in selecting a value for the slope (α).
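
The straight-line relationship that Duane observed can be sketched in Python as an ordinary least-squares fit of log cumulative MTBF against log total time. This is only an illustration under stated assumptions: the function names and the test-log figures are hypothetical, and a real growth analysis would normally rely on a dedicated reliability tool.

```python
import math


def fit_duane(cum_hours, cum_failures):
    """Fit log(cumulative MTBF) = intercept + alpha * log(total time).

    Returns (alpha, intercept), where alpha is the growth slope.
    """
    if len(cum_hours) != len(cum_failures) or len(cum_hours) < 2:
        raise ValueError("need at least two matching (time, failures) points")
    xs = [math.log(t) for t in cum_hours]
    # Cumulative MTBF = total time divided by total failures
    ys = [math.log(t / n) for t, n in zip(cum_hours, cum_failures)]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    intercept = y_mean - alpha * x_mean
    return alpha, intercept


def predicted_cumulative_mtbf(total_hours, alpha, intercept):
    """Cumulative MTBF read off the fitted line at a given total time."""
    return math.exp(intercept + alpha * math.log(total_hours))


# Hypothetical development test log: cumulative hours and cumulative failures
hours = [100, 300, 700, 1500, 3000]
failures = [5, 11, 19, 30, 45]
alpha, intercept = fit_duane(hours, failures)
print(f"growth slope alpha = {alpha:.2f}")
print(f"fitted cumulative MTBF at 5000 h = "
      f"{predicted_cumulative_mtbf(5000, alpha, intercept):.1f} h")
```

A slope close to zero would suggest that the modifications are not improving MTBF, while a larger slope indicates faster reliability growth.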

Specifying reliability, availability, and maintainability (RAM) requirements at the design stage is key to improving inherent reliability. Other requirements should include a design FMECA, fault tree analysis (FTA), and maintainability analysis.

3.2. Actual Reliability

Actual reliability is the reliability achieved during actual operation (operational reliability). It can differ from the design (inherent) reliability because of common cause failures arising from design errors (component selection), human errors (assembly, commissioning, operation, and maintenance), and operating environments that differ from those assumed during the design phase. The resulting reliability is what is termed the actual (achieved) reliability in operations. Common cause failures are often a difficult issue in the solution of system reliability problems. The benefits of redundancy can be completely negated by common cause failures, i.e., simultaneous failures of all redundant units at once. The reasons for common cause failures are numerous, including the ones shown in Figure 5.

Figure 5: Typical common cause failures impacting achieved reliability.
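
The effect of common cause failures on redundancy can be illustrated with a short Python sketch. It uses the beta-factor model, a common simplification that this article does not itself name, in which a fraction beta of each unit's failure probability is attributed to a shared cause that fails both units together. The function name and the numbers are hypothetical.

```python
def redundant_pair_failure_probability(q: float, beta: float) -> float:
    """Probability that a two-unit redundant pair fails.

    q    -- failure probability of a single unit over the mission
    beta -- assumed fraction of q attributable to a common cause (0 to 1)
    """
    if not (0.0 <= q <= 1.0) or not (0.0 <= beta <= 1.0):
        raise ValueError("q and beta must both lie between 0 and 1")
    q_common = beta * q                # shared cause takes out both units
    q_independent = (1.0 - beta) * q   # each unit failing on its own
    # Pair fails if the common cause occurs, or if both units fail independently
    return q_common + (1.0 - q_common) * q_independent ** 2


# Hypothetical unit with a 10% chance of failing during the mission
for beta in (0.0, 0.1, 1.0):
    p = redundant_pair_failure_probability(0.10, beta)
    print(f"beta = {beta:.1f}: pair failure probability = {p:.4f}")
```

With beta = 0 the pair gets the full benefit of redundancy, whereas with beta = 1 it is no more reliable than a single unit, which is the sense in which common cause failures can negate redundancy.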

3.2.1. How to increase actual reliability?

Maintenance and reliability professionals are faced with the daunting task of dealing with someone else's design, whether good or bad. When the design is finished, construction starts and finishes, and the plant is commissioned. Rarely is the plant delivered to the maintenance department with a comprehensive and well-documented maintenance requirements analysis and a maintenance plan. Consequently, as the reliability engineer, you are left to guess at the design intent, the plant limitations, the potential failure modes, and their likely consequences.

The operations people are, at the same time, learning how to operate the plant and experimenting with it, pushing it to its limits and occasionally well beyond its design intent. Budget and time constraints also limit the scope for correcting obvious design or maintainability problems in the new plant.

In best-practice organisations, a fully documented maintenance programme based on reliability-centered maintenance (RCM) is developed during the design phase. Unfortunately, in most capital projects across several industries, any reliability engineering or failure analysis is done informally and is not passed on to the maintenance department for use in developing asset management strategies and policies.

What are the key takeaways to drive operational reliability?

  • Standardise components used.
  • Understand the physics of failure.
  • Design for reliability.
  • Design for maintainability.
  • Make reliability analysis and testing an integral part of equipment specification.
  • For safety-, environment-, and production-critical equipment, specify additional reliability requirements, e.g., FMEAs and FTAs.

We at Tau SJV, as part of our reliability consulting, have the tools to conduct in-depth reliability analysis to assist with design reviews in the quest for a fit-for-purpose design of optimized reliability that meets your requirements. For any queries, or for assistance with a demo of the proprietary reliability analysis tool, please feel free to reach out to us at info@tausjv.com and we will gladly assist you.

Please follow our LinkedIn page, https://www.linkedin.com/company/tau-sjv and click on Newsletter to receive up to date posts such as these.