Problem Of The Month

December 2002: Reliability And Life Cycle Cost



You must know when to accept the financial risk and when to reject the financial risk based on reliability engineering calculations and life cycle costs.  Download this problem of the month as a PDF file (300KB) revised 01/18/03.

Reliability and money are a wonderful combination: one hand washes the other.  The problem of life cycle cost is to know when things fail so you can price out the failure costs with an Excel spreadsheet for life cycle cost, considering the time value of money in NPV calculations.  You find when things will fail by exercising the reliability calculations.  These chicken-or-egg problems are the reasons why engineers need training in Reliability Engineering Principles and Life Cycle Costs.

The Problem:

We have a situation where we can expect one failure per year.  The consequence of failure is very high at US$20,000,000 (many situations of high consequence of failure exist with pipelines, refineries, chemical plants, shipping, aviation, transportation, and so forth).  We can mitigate the failure costs by installing equipment to make the system more reliable, but of course the capital expenditures cost money.  We must balance the costs of capital against the reduction of risks; the issue is to find the lowest long term cost of ownership.  The lowest long term cost of ownership involves net present value (NPV) decisions.  NPV calculations take into account the time value of money.  We want to mitigate the high cost of unreliability, which means we don’t want to spend too much and we don’t want to spend too little over the life of the equipment.

Our default consequence example has an annual cost of US$20,000,000.  This is the default case for doing nothing to mitigate the failure.  On a risk based calculation, the probability of failure = 1.0.  Probability is a statement of unreliability.  Reliability + Unreliability = 1  from the complementary equation.

The amount of money at risk is $risk = (probability of failure)*($consequence of failure).  For the default case of one failure per year with no mitigating action, the risk is 1*$20,000,000 = $20,000,000 per year.  Taking this high risk requires no expenditure of capital.

Installing redundant instruments would decrease the probability of failure, i.e., reliability will increase as the unreliability decreases [reliability + unreliability = 1].  The reliability of each instrument is 95% based on a one year mission.  The instruments can be installed in parallel for increased reliability.  Installed instrument cost is US$10,000 per instrument. Based on a financial justification, how many devices should be installed considering reliability and financial consequences? 

Take these conditions for the spreadsheet calculations: discount rate = 12%, tax provision = 38%, project life = 20 years.

Reliability calculations:

The first instrument installed will function as if it is in series.  The second, third, fourth, and so forth instruments will function as if installed in parallel operation.  The parallel reliability calculation for the system is:
            Rsystem = 1 - (1 - reliability)^N
where N is the number of similar items in parallel. 

The probability of failure (a statement of unreliability)  is:
            UR = POF = 1 - Rsystem = (1 - reliability)^N
Table 1 shows the reliability calculations along with the probability of failure.  The financial exposure is calculated as (column 3)*($20,000,000).
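The Table 1 quantities can be regenerated from the formulas above.  The following Python sketch is only an illustration and is not the author's spreadsheet: the function name `table1_rows` and the range of zero to six instruments are assumptions, and the column order (number of instruments, reliability, probability of failure, $Risk per year, capital cost) is inferred from the column references in the text.

```python
# Illustrative sketch: rebuild the Table 1 quantities from the stated inputs.
INSTRUMENT_RELIABILITY = 0.95      # per instrument, one-year mission (from the text)
CONSEQUENCE = 20_000_000           # US$ per failure (from the text)
COST_PER_INSTRUMENT = 10_000       # US$ installed (from the text)


def table1_rows(max_instruments=6):
    """Return one tuple per case: (N, reliability, POF, $risk/yr, capital)."""
    rows = []
    for n in range(max_instruments + 1):
        # N = 0 is the do-nothing case, where POF = 1
        pof = (1 - INSTRUMENT_RELIABILITY) ** n
        reliability = 1 - pof
        risk_per_year = pof * CONSEQUENCE
        capital = n * COST_PER_INSTRUMENT
        rows.append((n, reliability, pof, risk_per_year, capital))
    return rows


if __name__ == "__main__":
    print(f"{'N':>2} {'Reliability':>14} {'POF':>14} {'$Risk/yr':>16} {'Capital':>10}")
    for n, r, pof, risk, cap in table1_rows():
        print(f"{n:>2} {r:>14.8f} {pof:>14.3e} {risk:>16,.2f} {cap:>10,}")
```

Each added instrument multiplies the probability of failure, and therefore the annual $Risk, by 0.05 while adding US$10,000 of capital.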

The problem can be viewed two ways:
 1.  The financial calculation for each case can be converted into net present values driven only by the $Risk for each year and the capital costs which will be expended.  The option with the lowest cost, i.e., the least negative NPV, is the winner.  In short, use the life cycle cost spreadsheet:
            A)  put the capital cost from Table 1, column 5 into cell D5
            B)  put the annual cost from Table 1, column 4 into cells E17:X17
            C)  read the NPV in cell C3 (see Table 2, column 6)
The case with the least negative value is the winner.
2.  The default case is the most expensive and a saving can be calculated between the no instrument case and the alternatives along with the capital costs required for each alternative.  In short, use the life cycle cost spreadsheet:
            A)  put the capital cost from Table 1, column 5 into cell D5
            B)  put the annual cost saving ($20,000,000 - $Risk/yr) from Table 1, column 4 into cells E19:X19
            C)  read the NPV in cell C3 (see Table 2, column 7)
The option with the largest positive NPV is the winner in this cost difference condition which examines the savings (of course it requires a datum case).

 Table 2 shows the calculations with both views on NPV.  The winning case is installing four instruments which operate in parallel.
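The two NPV views can be sketched in a few lines of Python.  This is a deliberately simplified, pre-tax illustration and not the author's life cycle cost spreadsheet: it assumes end-of-year cash flows and ignores the 38% tax provision and depreciation, so the dollar values should not be expected to match Table 2.  The function names `annuity_factor`, `npv_of_costs`, and `npv_of_savings` are hypothetical.

```python
# Simplified, pre-tax sketch of the two NPV views (NOT the author's spreadsheet).
DISCOUNT_RATE = 0.12               # from the text
PROJECT_LIFE = 20                  # years, from the text
CONSEQUENCE = 20_000_000           # US$ per failure, from the text
COST_PER_INSTRUMENT = 10_000       # US$ installed, from the text
INSTRUMENT_RELIABILITY = 0.95      # one-year mission, from the text


def annuity_factor(rate, years):
    """Present value of 1 per year paid at the end of each year (assumption)."""
    return (1 - (1 + rate) ** -years) / rate


def npv_of_costs(n):
    """View 1: capital now plus discounted annual $risk, as a negative number."""
    risk = (1 - INSTRUMENT_RELIABILITY) ** n * CONSEQUENCE
    factor = annuity_factor(DISCOUNT_RATE, PROJECT_LIFE)
    return -(n * COST_PER_INSTRUMENT + risk * factor)


def npv_of_savings(n):
    """View 2: discounted savings versus the do-nothing datum, less capital."""
    return npv_of_costs(n) - npv_of_costs(0)


if __name__ == "__main__":
    for n in range(7):
        print(f"{n} instruments: costs {npv_of_costs(n):>16,.0f}"
              f"  savings {npv_of_savings(n):>16,.0f}")
    best = max(range(7), key=npv_of_costs)
    print("least negative cost NPV at", best, "instruments")
```

Under these simplifying assumptions the four-instrument case has both the least negative cost NPV and the largest savings NPV, which agrees with the conclusion drawn from Table 2.  The second view is simply the first view shifted by the NPV of the datum case, so both views must rank the alternatives identically.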

You need the reliability values which drive the probability of failure.  The probability of failure multiplied by the exposure consequence is the amount of risk you must guard against each year.  Of course, higher reliability requires more capital equipment.  Using the NPV values you can find the condition which represents the lowest long term cost of ownership, here at four devices.

The $Risk/year goes into the simple NPV calculations along with the capital expenditure to drive the negative NPV values.  The annual savings for each capital expenditure are the $Risk/year for the datum case of no protective instruments less the $Risk/year for the given number of instruments; these savings produce the positive NPV values.  Of course you can also see the Δ values by taking the differences between the negative NPVs.

A little technology, a little money, and a simple spreadsheet that considers the time value of money along with tax provisions and so forth give a short, sweet solution.  This is an illustration of how reliability and life cycle cost are a wonderful combination to achieve a decisive action plan for reducing financial exposure.  As John Ruston said over 100 years ago: “It’s foolish to spend too much money but it’s unwise to spend too little.”  It’s foolish to take too little risk and it’s unwise to take too much risk; you get it right for the lowest long term cost of ownership by using NPV decisions and reliability principles.

This problem is described in “Reliability Issues From A Management Perspective,” prepared for the 52nd API Pipeline conference, San Antonio, TX, April 18, 2001, which you can download as a PDF file from a list of technical papers on this website.

Reliability systems (series and parallel):

When devices are placed in series, the reliability of the system follows the computations shown in Figure 1.  Reliability for the system can plummet when many items are placed in series, because system reliability is the product of the individual reliabilities and can never exceed that of the least reliable device in the series.

With a series system, when any link in the chain fails, the entire chain fails (remember the old-fashioned Christmas tree lights where failure of one light bulb resulted in the failure of the entire string).  The test for a series system is easy: take out items one by one, and if the system fails, then the individual component was in series.

Many very long series systems exist, and they function with high reliability by pushing individual reliabilities to high levels through extensive verification testing and by operating the systems at low loads so that the “freeboard” between loads and strengths is very large.  If loads and strengths do not overlap, then high reliability is obtained for each component to provide high system integrity.

An example of a long series system is seen in long natural gas pipelines which lack en route storage capabilities.  A long gas pipeline can have ~125 welded connections per mile and the pipelines can easily be 1500 miles long.  This represents 187,500 welded connections in series!  If the system needs an overall reliability of 90%, the individual reliabilities must satisfy Rs = 0.9 = R^187500, so the reliability of each welded connection (assuming all reliability values to be the same) is R = 0.9^(1/187500) = 0.9^0.00000533 = 0.999999438, which is a spectacular value for individual components.
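The weld arithmetic is easy to check numerically.  The short Python sketch below uses only the figures quoted in the paragraph above; the variable names are illustrative.

```python
import math

WELDS_PER_MILE = 125        # from the text
PIPELINE_MILES = 1500       # from the text
SYSTEM_TARGET = 0.90        # required overall reliability, from the text

# Number of welded connections in series
n_welds = WELDS_PER_MILE * PIPELINE_MILES

# Series system with identical components: Rs = R ** n, so R = Rs ** (1 / n)
weld_reliability = SYSTEM_TARGET ** (1 / n_welds)

# Sanity check: raising R back to the n-th power recovers the system target
assert math.isclose(weld_reliability ** n_welds, SYSTEM_TARGET, rel_tol=1e-6)

print(n_welds)                      # 187500
print(f"{weld_reliability:.9f}")    # 0.999999438
```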

Parallel systems are shown in Figure 2.  The system survives if one or more items survive.  The system reliability is 1 minus the product of the individual unreliabilities.

It only requires a few items in parallel to achieve high overall system reliabilities.
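The series and parallel computations can be compared side by side in Python.  This is a minimal sketch: the function names are hypothetical, and the component reliability of 0.95 is borrowed from the instrument example earlier in this problem, not taken from Figures 1 or 2.

```python
from math import prod


def series_reliability(reliabilities):
    """All components must survive: product of the reliabilities."""
    return prod(reliabilities)


def parallel_reliability(reliabilities):
    """At least one must survive: 1 minus the product of the unreliabilities."""
    return 1 - prod(1 - r for r in reliabilities)


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        components = [0.95] * n   # illustrative value, not from Figures 1 or 2
        print(n,
              round(series_reliability(components), 6),
              round(parallel_reliability(components), 6))
```

With four such components the series arrangement drops to roughly 0.81 while the parallel arrangement rises above 0.99999.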

Contrast the system outcome between parallel systems in Figure 2 with the series systems in Figure 1.  The numbers speak for themselves. 

Figures 1 and 2 come from the Reliability Engineering Principles training course, section 4, concerning reliability models.


Refer to the caveats on the Problem Of The Month Page about the limitations of the following solution. Maybe you have a better idea on how to solve the problem. Maybe you find where I've screwed up the solution and you can point out my errors as you check my calculations. E-mail your comments, criticism, and corrections to Paul Barringer.

Last revised 01/18/2003
© Barringer & Associates, Inc. 2001
