Problem Of The Month

July 2001—Maintainability

 


 

What’s Maintainability?

Maintainability is defined in MIL-HDBK-470A, “Designing and Developing Maintainable Products and Systems”, dated 4 August 1997, as:
 

“The relative ease and economy of time and resources with which an item can be retained in, or restored to, a specified condition when maintenance is performed by personnel having specified skill levels, using prescribed procedures and resources, at each prescribed level of maintenance and repair.  In this context, it is a function of design.”

 

AMSAA’s Guide To Maintainability explains why to focus on maintainability: it suggests roughly 38% of the life cycle cost is directed toward maintainability issues.

 

Design for maintainability requires a product that is:

            serviceable (must be easily repaired) and
            supportable (must be cost-effectively kept in or restored to a usable condition).

Better yet if the design includes a durability feature called:

            reliability (absence of failures) then you can have the best of all worlds. 

 

Supportability has a design subset involving:

            testability (a design characteristic that allows the status of an item to be determined and faults within the item to be isolated in a timely and effective manner, such as can occur with built-in-test (BIT) equipment, so the item can demonstrate its status (operable, inoperable, or degraded); the same capability supports routine troubleshooting and verification that the equipment has been restored to useful condition following maintenance).

 

Maintainability is primarily a design parameter. 

            The design for maintainability defines how long equipment will be down and unavailable. 

Yes, you can reduce the amount of downtime by having a highly trained workforce and a responsive supply system, which together pace the speed of maintenance to achieve minimum downtimes. 

            Unavailability occurs when the equipment is down for periodic maintenance and for repairs. 

Unreliability is associated with failures of the system—the failures can be associated with planned outages or unplanned outages. 

 

Maintainability is a true design characteristic.  Attempts to improve the inherent maintainability of a product/item after the design is frozen are usually expensive, inefficient, and ineffective, as demonstrated so often in manufacturing plants when the first maintenance effort requires the use of a cutting torch to access the item requiring replacement.

 

Poor maintainability results in equipment that is:

            unavailable,

            expensive because of the cost of unreliability, and

            an irritant to all parties who touch the equipment or have responsibility for it.

 

How Do Maintainability And Reliability Go Together?

Reliability and maintainability are considered complementary disciplines from the inherent availability equation.  Inherent availability looks at availability from a design perspective:
 

Ai = MTBF/(MTBF+MTTR). 

If mean time between failure or mean time to failure is very large, compared to the mean time to repair or mean time to replace, then you will see:

            high availability. 

Likewise, if mean time to repair or replace is minuscule, then availability will be:

            high. 

As reliability decreases (i.e., MTTF becomes smaller), better maintainability (i.e., shorter MTTR) is needed to achieve the same availability.  Please note:

            You cannot repair yourself to happiness!

Happiness occurs by not having failures that require maintenance activities to renew or restore the system to its state of availability.
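
As a minimal sketch of the inherent availability equation above, the Python below evaluates Ai and solves the same equation for the MTTR needed to hold a target availability as MTBF shrinks.  The function names and the numbers (in hours) are hypothetical, chosen only for illustration.

```python
def inherent_availability(mtbf: float, mttr: float) -> float:
    """Ai = MTBF / (MTBF + MTTR); both arguments in the same time units."""
    return mtbf / (mtbf + mttr)


def mttr_for_target(mtbf: float, target_ai: float) -> float:
    """Largest MTTR that still meets target_ai (Ai equation solved for MTTR)."""
    return mtbf * (1.0 - target_ai) / target_ai


if __name__ == "__main__":
    # Hypothetical values in hours, for illustration only.
    for mtbf in (10_000.0, 1_000.0, 100.0):
        ai = inherent_availability(mtbf, mttr=10.0)
        limit = mttr_for_target(mtbf, target_ai=0.99)
        print(f"MTBF={mtbf:>8.0f} h  Ai at MTTR=10 h: {ai:.4f}  "
              f"MTTR allowed for Ai=0.99: {limit:.1f} h")
```

Each tenfold drop in MTBF cuts the allowable MTTR tenfold for the same Ai, which is the arithmetic behind not being able to repair yourself to happiness.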

 

In commercial operations, downtime for maintenance incurs two expensive activities:

1.      High costs for people and supplies to make the repairs

2.      High costs for lost profit opportunities.

For military operations, downtime for maintenance incurs two expensive activities:

1.      Equipment down for maintenance is equipment that cannot defend against an enemy when challenged (spell that as a target for destruction, which involves a very high expense and loss of life)

2.      High costs for the logistic tail to get spare parts to the battlefield and make the repairs while the equipment is subject to destruction by the enemy.

When reliability increases, maintainability is less important for achieving the same availability (for example, if maintenance to restore a failure is required only every 50 years, then availability is very high).  Thus tradeoffs can be made between reliability and maintainability to achieve the same availability.  In fact, the two disciplines of reliability and maintainability must work hand-in-hand to achieve the objectives.  Ai is the largest availability value you can observe, and you would see it only if you never had any system abuses.

 

In the operational world we talk of the operational availability equation.  Operational availability looks at availability by collecting all of the abuses in a practical system:

 

Ao = MTBM/(MTBM+MDT).

 

The mean time between maintenance (MTBM) includes all corrective and preventive actions (compared to MTBF, which only accounts for failures).  The mean down time (MDT) includes all time the system is down for corrective maintenance (CM), including delays (compared to MTTR, which only addresses repair time), plus self-imposed downtime for preventive maintenance (PM), although it is preferred to perform most PM actions while the equipment is operating. 
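
To make the difference between the two equations concrete, here is a small sketch that computes both Ai and Ao from one maintenance record.  The record layout, the names, and every number are hypothetical, and the sketch assumes each PM action listed takes the equipment down.

```python
from dataclasses import dataclass


@dataclass
class MaintenanceEvent:
    kind: str            # "CM" (corrective) or "PM" (preventive, done with the system down)
    active_hours: float  # hands-on repair or service time
    delay_hours: float   # waiting for parts, people, tools, or permits


def availability_summary(uptime_hours, events):
    """Return (Ai, Ao) from total uptime and a list of downtime events."""
    failures = [e for e in events if e.kind == "CM"]
    mtbf = uptime_hours / len(failures)                       # failures only
    mttr = sum(e.active_hours for e in failures) / len(failures)
    mtbm = uptime_hours / len(events)                         # all CM and PM actions
    mdt = sum(e.active_hours + e.delay_hours for e in events) / len(events)
    return mtbf / (mtbf + mttr), mtbm / (mtbm + mdt)


# Hypothetical one-year record, invented for illustration only.
events = [
    MaintenanceEvent("CM", active_hours=6.0, delay_hours=18.0),
    MaintenanceEvent("PM", active_hours=4.0, delay_hours=0.0),
    MaintenanceEvent("CM", active_hours=10.0, delay_hours=30.0),
    MaintenanceEvent("PM", active_hours=4.0, delay_hours=0.0),
]
ai, ao = availability_summary(uptime_hours=8000.0, events=events)
print(f"Ai = {ai:.4f}, Ao = {ao:.4f}")
```

With these made-up figures the delays and the PM outages pull Ao below Ai even though the hands-on repair times are unchanged.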

 

Ao is a smaller availability number than Ai because of naturally occurring abuses when you shoot yourself in the foot.  The uptime and downtime concepts are explained in Figure 1 for constant values of availability.  Figure 1 shows the difficulty of increasing availability from 99% to 99.9% (increase MTBM by one order of magnitude or decrease MDT by one order of magnitude) compared to improving availability from 85% to 90% (which requires increasing MTBM, or decreasing MDT, by a factor of only about 1.6, roughly one-fifth of an order of magnitude).  The log-log relationships make it difficult to explain what is needed to achieve/maintain a given level of operational availability.
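
The comparison can be checked with a few lines of arithmetic.  This sketch assumes only the Ao equation above, which rearranges to MTBM/MDT = Ao/(1 - Ao); the function names are hypothetical.

```python
import math


def uptime_ratio(availability: float) -> float:
    """MTBM/MDT implied by Ao = MTBM/(MTBM + MDT), i.e. Ao/(1 - Ao)."""
    return availability / (1.0 - availability)


def improvement_factor(a_from: float, a_to: float) -> float:
    """Factor by which MTBM must rise (or MDT must fall) to go from a_from to a_to."""
    return uptime_ratio(a_to) / uptime_ratio(a_from)


for a_from, a_to in ((0.85, 0.90), (0.99, 0.999)):
    factor = improvement_factor(a_from, a_to)
    print(f"{a_from:.1%} -> {a_to:.1%}: factor {factor:.2f} "
          f"({math.log10(factor):.2f} orders of magnitude)")
```

The factor applies to the ratio MTBM/MDT, so it can be bought with longer MTBM, shorter MDT, or a mix of both.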

 

 


 

Operational availability includes issues associated with:

1.      inherent design,

2.      availability of maintenance personnel,

3.      availability of spare parts,

4.      maintenance policy, and

5.      a host of other non-design issues (whereas inherent availability addresses only the inherent design)—in short, all the abuses! 

6.      Testability, the subset of maintainability/supportability, enters strongly into the MDT portion of the equation to clearly identify the status of an item so as to know if a fault exists and to determine if the item is dead, alive, or deteriorated—these issues always affect affordability issues.  

Operational availability depends upon operational maintainability which includes factors totally outside of the design environment such as:

1.      insufficient number of spare parts,

2.      slow procurement of equipment,

3.      poorly trained maintenance personnel,

4.      lack of proper tools and procedures to perform the maintenance actions. 

Achieving excellent operational maintainability requires:

1.      sound planning,

2.      effective engineering design and test verification,

3.      excellent manufacturing conformance,

4.      adequate support system [logistics] for spare parts,

5.      knowledgeable people as operators and maintainers with effective training, and

6.      incorporation of lessons learned from previous or similar equipment.

Notice how these features go together with a modern automobile equipped with one [or more] control modules so that when maintenance/service is required the first step is to interrogate the onboard computer for faults to diagnose both the obvious and hidden flaws for corrective action.  This shortens repair time and incorporates prognostics (see AMSAA-TR-736 and AMSAA-TR-2006-4) into active repair decisions to improve availability and reduce maintenance actions.

 

How To Quantify Repair/Down Times?

For maintainability issues, the lognormal distribution is frequently employed to represent repair/down delays.  Roughly 85% to 95% of repair times are adequately represented by the lognormal distribution so you should naturally gravitate to lognormal distributions to reflect an adequate model for reliability/availability/maintainability (RAM) models.  The lognormal distribution has long tails to the right.  My suggestion is to generally use the lognormal distribution for repair/down times rather than splitting hairs by chasing other distributions—in short use the KISS principle (Keep It Simple, Stupid!).

 

The lognormal distribution says you should expect many repairs are completed in a small time interval.  However, the data also says to expect many repairs will also have long times.  The short time/long time data makes everyone crazy in trying to explain maintainability issues. 

 

Figure 2 shows 10 repair times for each of three skill levels of maintenance personnel, along with the median repair times (MuAL) and the shape factors (SigF), in hours.  Given a standard repair time of 13.5 hours, the maintainability value for the Journeyman maintenance person is 88.4% (88.4% of all repair times are expected to be completed in 13.5 hours).  The First Class rated maintenance person demonstrates a maintainability of 60.5%.  The Apprentice demonstrates a maintainability of only 37.6%.  By the way, this is data from a very good shop with small shape factors indicating very close control of repair times. 

 

Typical lognormal shape factors (SigF), in hours, which describe the scatter in the data, are:

1 =  perfection with all repair times the same value (this only happens in your fantasy world!),

2 =  typical shape factor for a good control on repair times,

3 =  shape factor when repair times are a little unruly, and

4 =  shape factor for really badly controlled repair conditions, when repair times are governed by Keystone Cops (meaning lots of activity but a long time to achieve measurable results).

The smaller the SigF, the less scatter you should expect in the data and the shorter the tails on the longer repair/down times.
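
To see how quickly the right-hand tail grows with the shape factor, the sketch below assumes SigF is the antilog of the standard deviation of ln(time), that is SigF = exp(sigma).  That reading is an assumption (it is consistent with SigF = 1 meaning no scatter) and is not confirmed by the plotting software's documentation; the function name is hypothetical.

```python
from statistics import NormalDist


def percentile_multiplier(sigf: float, fraction: float) -> float:
    """Repair time at `fraction` divided by the median time.

    Assumes SigF = exp(sigma), so the multiplier is SigF ** z, where z is
    the standard normal quantile for `fraction`.
    """
    z = NormalDist().inv_cdf(fraction)
    return sigf ** z


for sigf in (1.0, 1.3, 2.0, 3.0, 4.0):
    mult = percentile_multiplier(sigf, 0.95)
    print(f"SigF={sigf:.1f}: 95% of repairs finish within {mult:.1f} x the median time")
```

Under this assumption a shop with SigF = 2 should expect about one repair in twenty to run past roughly three times its median repair time.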

Using the median values (MuAL), in hours, from the “family of curves”, the First Class repair person takes about 20% longer to complete the tasks than does a Journeyman, and the Apprentice takes about 20% longer to complete the tasks than does a First Class repair person (or you can say the Apprentice takes 50% longer to make the repair than the Journeyman).  Another frequently used single point estimate is the mean time to make repairs, and the values are Journeyman = 10.3 hours, First Class = 13.0 hours, and Apprentice = 15.8 hours. 

 

Data (in rank order) used for Figure 2 is:
  Journeyman: 7, 7.7, 8.5, 8.9, 9.7, 10.5, 10.9, 11.5, 12.7, 14.9 hours
  First Class: 7.7, 8.8, 11, 11.6, 12.2, 12.7, 13.9, 14.5, 16.5, 20 hours
  Apprentice: 8.9, 11.5, 13, 13.6, 14, 14.5, 15.7, 17.5, 22, 25.5 hours
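
As a rough check on the numbers quoted for Figure 2, the sketch below fits a lognormal distribution to each data set using the mean and sample standard deviation of the log repair times, then estimates the maintainability at the 13.5-hour allowance quoted with the 88.4% figure.  This simple fit is an assumption and is not necessarily the method used by the plotting software, so expect results near, but not identical to, the published values.

```python
import math
from statistics import NormalDist, mean, stdev

REPAIR_HOURS = {
    "Journeyman": [7, 7.7, 8.5, 8.9, 9.7, 10.5, 10.9, 11.5, 12.7, 14.9],
    "First Class": [7.7, 8.8, 11, 11.6, 12.2, 12.7, 13.9, 14.5, 16.5, 20],
    "Apprentice": [8.9, 11.5, 13, 13.6, 14, 14.5, 15.7, 17.5, 22, 25.5],
}


def fit_lognormal(times):
    """Return (mu, sigma): mean and sample standard deviation of ln(time)."""
    logs = [math.log(t) for t in times]
    return mean(logs), stdev(logs)


def maintainability(times, allowed_hours):
    """Estimated probability that a repair is finished within allowed_hours."""
    mu, sigma = fit_lognormal(times)
    return NormalDist(mu, sigma).cdf(math.log(allowed_hours))


for skill, times in REPAIR_HOURS.items():
    mu, sigma = fit_lognormal(times)
    print(f"{skill:<12} median={math.exp(mu):5.1f} h  "
          f"exp(sigma)={math.exp(sigma):.2f}  "
          f"M(13.5 h)={maintainability(times, 13.5):.1%}")
```

The ranking of the three skill levels comes out the same as in the text, with the Journeyman most likely and the Apprentice least likely to finish inside the allowance.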

 


Details about the lognormal equation are described in  The New Weibull Handbook 4th edition by Dr. Robert B. Abernethy. 

 

Figure 1 was made using SuperSMITH Visual plotting software.  Figure 2 and 4 were made by using SuperSMITH Weibull software [you can select different types of probability paper such as Weibull, lognormal, normal, Gumbel lower, Gumbel upper, Weibull 3-parameter, and lognormal 3-parameter paper to analyze different datasets] to construct the probability plots.

No matter what logic is used, those who have never had to perform the task themselves will view the actual repair times as “too long”.  Those who have performed the tasks will see the times as reasonable, with allowances for the different skill levels of those doing the work.

For More Details On Maintainability See-
See the March ’09 Problem for more details on How To Calculate Maintainability Values manually and by use of lognormal probability plots.

For Further References On Maintainability See-
Finally, other details, downloadable at no charge, concerning reliability/maintainability/availability theory are given in:
            DOD Guide For Achieving Reliability, Availability, and Maintainability for a complete system concerning reliability and maintainability,
            Test & Evaluation of System Reliability, Availability, and Maintainability: A Primer,
            Maintainability Design Techniques which includes some ergonomic details,
            MIL-HDBK-338 concerning electronic reliability and maintainability,
            MIL-HDBK-470 provides guidance to maintainability managers and engineers,
            MIL-HDBK-472 provides current maintainability prediction procedures,
            MIL-HDBK-764 concerning system safety engineering design guide for maintainability engineering,
            MIL-HDBK-2084 for maintainability of avionic and electronic systems and equipment,
            MIL-STD-470 maintainability program for systems and equipment,
            MIL-STD-721 definitions of terms for reliability and maintainability [for later details see MIL-HDBK-338],
            NASA-STD-8729.1 for planning, developing and managing an effective reliability and maintainability program,
            NASA-TM-4628 recommended techniques for effective maintainability,
            NATO-ARMP-1 requirements for reliability and maintainability,
            NATO-ARMP-4 guidance for writing NATO R&M requirements documents,
            NATO-ARMP-5E guidance on reliability and maintainability training,
            NATO-ARMP-6E reliability and maintainability in-service,
            NATO-ARMP-7 NATO R&M terminology,
            NATO-ARMP-8E reliability and maintainability for procurement of off-the-shelf equipment,
            TM 5-698-5 reliability, availability, and maintainability characteristics of over 200 components,
            UK-DefStan00-43-Part1-Issue-1 reliability and maintainability assurance activity; and
            UK-DefStan00-44-Part2-Issue1 reliability and maintainability data collection.
Perhaps you begin to see the value of maintainability technology in the military and NASA applications with the large number of specifications.

Comments:
Refer to the caveats on the Problem Of The Month Page about the limitations of the following solution. Maybe you have a better idea on how to solve the problem. Maybe you find where I've screwed-up the solution and you can point out my errors as you check my calculations. E-mail your comments, criticism, and corrections to Paul Barringer.

Technical tools are only interesting toys for engineers until results are converted into a business solution involving money and time. Complete your analysis with a bottom line which converts $'s and time so you have answers that will interest your management team!

Last revised 02/17/2009
© Barringer & Associates, Inc. 2001
