Problem Of The Month
March 2009—How To Calculate Maintainability Values
What Times Go Into Maintainability Issues?
Maintainability, and how maintainability and reliability go together, is explained in the July 2001 problem. This problem shows how to produce maintainability values for use in reliability, availability, and maintainability (RAM) models. The maintainability values obtained describe how long various blocks in a RAM model are down for repairs. Perhaps a more complete representation is the system downtime, which drives a metric called time to restore. The longer restoration time represents the position of a buyer because it tells about the longer delays (and thus greater expenses) of the equipment, which include:
time required for verification, diagnosis, and location of the fault problem,
time to locate the part requiring repair,
time assigned to logistic delays (external and internal) in procuring parts,
time to clear the system for maintenance in hazardous or sterile environments,
time to actually make the repair,
time to treat the system for return to hazardous or sterile environments,
time to verify the malfunction has been repaired and the system will perform successfully; and
time to validate the system is alive and well plus capable of performing the intended function.
The more complete representation of downtime includes a longer central tendency for the typical maintenance delay and greater scatter in individual downtimes than is experienced in the metric called time to repair, which represents the position of a seller of the equipment because the numbers are smaller and easier to sell.
The time to restore drives an average value called MTTR (watch out here because the R is for restore), which is better described as MDT, representing the average down time. The time to make the repair drives an average value that is also called MTTR (watch out again because here the R is for repair), so the shorter repair time carries the same label. What should you use for RAM models? Well, it depends on whether you’re buying or selling. In my view it should be the longer restore time, as that will drive the real time clock and the real expenditures of cash. So we have a pitfall for the unwary that needs to be clarified before the fists start to fly!
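To see how far apart the two numbers can be, here is a minimal Python sketch. The minutes assigned to each downtime element are invented purely for illustration (they do not come from any dataset in this problem); only the list of elements follows the text above.

```python
# Hypothetical downtime elements for one maintenance event, in minutes.
# The categories follow the list in the text; the numbers are invented.
downtime_minutes = {
    "verify, diagnose, and locate the fault": 30.0,
    "locate the part requiring repair": 10.0,
    "logistic delays in procuring parts": 120.0,
    "clear the system for maintenance": 20.0,
    "make the repair": 45.0,
    "treat the system for return to its environment": 15.0,
    "verify the malfunction has been repaired": 10.0,
    "validate the system performs its function": 10.0,
}

# Seller's view: only the hands-on repair time counts.
time_to_repair = downtime_minutes["make the repair"]

# Buyer's view: every element of the outage counts.
time_to_restore = sum(downtime_minutes.values())

print(f"time to repair  = {time_to_repair:.0f} minutes")
print(f"time to restore = {time_to_restore:.0f} minutes")
```

With these made-up numbers the restore time is several times the repair time, which is why the two "MTTR" values must never be mixed in the same RAM model.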
Some things about failures and restoration to serviceable conditions are always “true”:
downtime is always too long and occurs at the most inopportune time,
repair costs are always too expensive,
the consequences of downtime will cost more than is reasonably expected,
customers waiting on return to uptime status “know” they can accomplish the restoration more quickly than trained repair people; and
customers always believe nothing in their use of the device has contributed to the downtime; they “know” they are the victims.
Murphy, the saboteur, makes sure the total downtime is longer than anyone thinks is reasonable. Of course, the way to avoid these unhappy downtime events is to avoid the failure by achieving reliability (not just talking about reliability but achieving it). No matter what maintainability logic is used, those who have never had to perform the task themselves will view the actual repair times as “too long”. Those who have performed the tasks will see the times as reasonable, with allowances for the different skill levels of those doing the work.
Time required for maintenance actions is not a fixed value. The time varies by skill levels, access to replacement parts, diagnosis time, etc. Subsequent maintenance actions all occur with individually different values requiring:
central tendency values along with a
measure for scatter in the data.
The maintainability numbers require two things:
1. a central tendency value for the downtime:
mean (usually calculated), or
median (MuAL) usually from a probability plot;
2. a measure of scatter for the downtime:
standard deviation (usually calculated), or
shape factor (SigF) usually from a probability plot.
So what’s the difference in the metrics, other than the obvious central tendency/scatter, and why does it matter? The answer is that some have dimensions of time and some are dimensionless, as will be explained below. Some RAM modeling software can handle either type of value (for example, RAPTOR, where you must declare the choice under the options menu), while other RAM software is not so clear. If you use the wrong values in the RAM software, you’ll get incorrect answers for your models.
Figure 1 shows data and how hand calculations are made. The data for Figure 1 come from rounding the data in Table 8.1 from Chapter 8, pages 399-441, of Dimitri Kececioglu’s Reliability Engineering Handbook, Volume 1, 1991, ISBN 0-13-772294-X. Making the hand calculations is tedious, and the graphical method of Figure 2 is suggested.
It is important to distinguish between lognormal scientific values (dimensionless) and engineering values (with dimensions). Misusing the maintainability values can result in serious errors in your RAM models, so you need to know how the model handles the maintainability input data you supply!
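The relationship between the two sets of values can be sketched in a few lines of Python. This is a sketch under stated assumptions: it takes the median of 77.675 minutes and the antilog shape factor of 1.956679 quoted in this problem, assumes natural logarithms, and uses the standard lognormal formulas for the mean and standard deviation. The variable names are my own.

```python
import math

# Engineering values quoted in this problem.
mu_al = 77.675      # median repair time, minutes
sig_f = 1.956679    # shape factor (antilog value from the hand calculation)

# Scientific (log-space) values, assuming natural logarithms.
mu = math.log(mu_al)      # log-mean
sigma = math.log(sig_f)   # log-standard deviation

# Standard lognormal relationships back to the time scale.
mean_minutes = math.exp(mu + sigma**2 / 2.0)
std_minutes = mean_minutes * math.sqrt(math.exp(sigma**2) - 1.0)

print(f"log-mean mu         = {mu:.4f}")
print(f"log-std dev sigma   = {sigma:.4f}")
print(f"median (MuAL)       = {math.exp(mu):.3f} minutes")
print(f"arithmetic mean     = {mean_minutes:.1f} minutes")
print(f"arithmetic std dev  = {std_minutes:.1f} minutes")
```

Feeding mu and sigma into a RAM model that expects MuAL and SigF (or the reverse) is exactly the kind of misuse warned about above, since the numbers differ by a logarithm.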
Figure 1-Hand Calculations
Probability Plots For The Statistics-
Figure 2 is a lognormal probability plot using the repair time data from Figure 3. The data make a good curve fit, with a pve% goodness of fit of 97.64%. Any pve% value of 10% or above indicates a valid curve fit.
Figure 2 Lognormal Probability Plot
Notice the statistics reported in Figure 2 are in engineering units of repair time (minutes). The graphical median MuAL = 77.675 minutes [as calculated in Figure 1]. SigF = 2.098 minutes [which compares favorably to the antilog value of 1.956679 in the calculation above]; the minor variation is due to the rank regression method of curve fitting. Engineers highly value the fast graphic representation over the tedious calculations.
Figure 1 uses a calculation method, whereas Figure 2 is derived from a curve fit of the data using Benard’s median rank plotting positions and rank regression techniques. Engineers usually prefer a graphical method, whereas scientists usually prefer a calculation method.
Figure 3 shows a probability density function (PDF) with the X-axis in time as recorded, and the long tail to the right is obvious.
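For readers who want to see the mechanics behind the graphical method, here is a rough Python sketch of Benard's median rank approximation followed by a least squares fit on lognormal paper. It is not the SuperSMITH Weibull implementation. The repair times are hypothetical, the function names are my own, and regressing ln(time) on the normal quantile of the median rank is an assumption about the regression direction.

```python
import math
from statistics import NormalDist


def benard_median_ranks(n):
    """Benard's approximation (i - 0.3) / (n + 0.4) for i = 1..n."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def lognormal_rank_regression(times):
    """Fit ln(t) = mu + sigma * z by least squares; return (MuAL, SigF)."""
    y = [math.log(t) for t in sorted(times)]
    # Convert each median rank to a standard normal quantile.
    z = [NormalDist().inv_cdf(p) for p in benard_median_ranks(len(y))]
    z_bar = sum(z) / len(z)
    y_bar = sum(y) / len(y)
    slope = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
             / sum((zi - z_bar) ** 2 for zi in z))
    intercept = y_bar - slope * z_bar
    # Intercept is the log-mean, slope is the log-standard deviation.
    return math.exp(intercept), math.exp(slope)


# Hypothetical repair times in minutes (not the Figure 1 data).
repair_times = [25.0, 40.0, 55.0, 70.0, 90.0, 120.0, 160.0, 250.0]
mu_al, sig_f = lognormal_rank_regression(repair_times)
print(f"MuAL = {mu_al:.2f} minutes, SigF = {sig_f:.3f}")
```

Because the fit is a regression through plotting positions rather than a direct calculation of the log-mean and log-standard deviation, small differences from the hand-calculated values are to be expected.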
Figure 3 Lognormal PDF In Time As Recorded
Figure 4 shows the X-axis transformed into a logarithm scale of time as recorded. Notice how symmetrical the curve appears with the logarithmic X-axis transform.
Figure 4 Lognormal PDF With Logarithmic X-Axis Transform
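The symmetry seen with the logarithmic axis can be checked numerically. The sketch below assumes the standard lognormal density and uses the MuAL and SigF values reported for Figure 2. The function names and the step sizes are my own choices for illustration.

```python
import math

# Values reported for Figure 2, converted to log space (natural logs assumed).
mu_al = 77.675
sig_f = 2.098
mu = math.log(mu_al)
sigma = math.log(sig_f)


def pdf_time(t):
    """Lognormal density per minute, X-axis in time as recorded."""
    z = (math.log(t) - mu) / sigma
    return math.exp(-0.5 * z * z) / (t * sigma * math.sqrt(2.0 * math.pi))


def pdf_log_time(x):
    """Density per unit of ln(time): an ordinary normal curve."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Equal steps either side of the median on the time axis: heights differ.
step = 50.0
left, right = pdf_time(mu_al - step), pdf_time(mu_al + step)

# Equal steps either side of ln(median) on the log axis: heights match.
d = 1.0
log_left, log_right = pdf_log_time(mu - d), pdf_log_time(mu + d)

print(f"time axis: f(median - 50) = {left:.5f}, f(median + 50) = {right:.5f}")
print(f"log axis:  g(mu - 1) = {log_left:.5f}, g(mu + 1) = {log_right:.5f}")
```

The unequal heights on the time axis are the long right tail of Figure 3; the matching heights on the log axis are the symmetry of Figure 4.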
Details about the lognormal equation are also described in The New Weibull Handbook 5th edition by Dr. Robert B. Abernethy.
Figure 1 shows details of the hand calculation. Figure 2 was made using SuperSMITH Weibull software. Using SuperSMITH Weibull software you can select different types of probability paper such as Weibull, lognormal, normal, Gumbel lower, Gumbel upper, Weibull 3-parameter, and lognormal 3-parameter paper to analyze different datasets and construct the probability plots.
For More Details On Maintainability See-
See the July ’01 Problem for more details on Maintainability.
For Further References On Maintainability
Finally, other details concerning reliability/maintainability/availability theory, downloadable at no charge, are given in:
DOD Guide For Achieving Reliability, Availability, and Maintainability for a complete system concerning reliability and maintainability,
Test Evaluation of System Reliability, Availability, and Maintainability—A Primer,
Maintainability Design Techniques which includes some ergonomic details,
MIL-HDBK-338 concerning electronic reliability and maintainability,
MIL-HDBK-470 provides guidance to maintainability managers and engineers,
MIL-HDBK-472 provides current maintainability prediction procedures,
MIL-HDBK-764 concerning system safety engineering design guide for maintainability engineering,
MIL-HDBK-2084 for maintainability of avionic and electronic systems and equipment,
MIL-STD-470 maintainability program for systems and equipment,
MIL-STD-721 definitions of terms for reliability and maintainability [for later details see MIL-HDBK-338],
NASA-STD-8729.1 for planning, developing and managing an effective reliability and maintainability program,
NASA-TM-4628 recommended techniques for effective maintainability,
NATO-ARMP-1 requirements for reliability and maintainability,
NATO-ARMP-4 guidance for writing NATO R&M requirements documents,
NATO-ARMP-5E guidance on reliability and maintainability training,
NATO-ARMP-6E reliability and maintainability in-service,
NATO-ARMP-7 NATO R&M terminology,
NATO-ARMP-8E reliability and maintainability for procurement of off-the-shelf equipment,
TM 5-698-5 reliability, availability, and maintainability characteristics of over 200 components,
UK-DefStan00-43-Part1-Issue-1 reliability and maintainability assurance activity; and
UK-DefStan00-44-Part2-Issue1 reliability and maintainability data collection.
Perhaps you begin to see the value of maintainability technology in military and NASA applications from the large number of specifications.
Refer to the caveats on the Problem Of The Month Page about the limitations of the following solution. Maybe you have a better idea on how to solve the problem. Maybe you find where I've screwed up the solution and you can point out my errors as you check my calculations. E-mail your comments, criticism, and corrections to Paul Barringer.
Technical tools are only interesting toys for engineers until results are converted into a business solution involving money and time. Complete your analysis with a bottom line which converts $'s and time so you have answers that will interest your management team!