Availability and Reliability
Availability describes how time is used: the fraction of time a system is up. Reliability describes the probability of a failure-free interval. Both are expressed as percentages.
Availability IS NOT equal to reliability except in the fantasy world of no downtime and no failures!
Availability, in the simplest form, is:
A = Uptime/(Uptime + Downtime) .
The denominator of the availability equation contains at least 8760 hours (it is not 8760 hours less 10% for screw-around time!) when the results are studied for an annual time frame.
Inherent availability looks at availability from a design perspective:
A_{i} = MTBF/(MTBF+MTTR).
If mean time between failures (MTBF) or mean time to failure (MTTF) is very large compared to the mean time to repair or mean time to replace (both abbreviated MTTR), then you will see high availability. Likewise, if mean time to repair or replace is minuscule, then availability will be high. As reliability decreases (i.e., MTTF becomes smaller), better maintainability (i.e., shorter MTTR) is needed to achieve the same availability. Of course, as reliability increases, maintainability becomes less important for achieving the same availability. Thus tradeoffs can be made between reliability and maintainability to achieve the same availability, and the two disciplines must work hand-in-hand to achieve the objectives. A_{i} is the largest availability value you can observe if you never had any system abuses.
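The tradeoff between reliability and maintainability can be sketched numerically. The following is a minimal illustration, not part of the original material: the function names and the 99% target are hypothetical, and the second function simply solves the A_{i} equation for MTTR.

```python
def inherent_availability(mtbf, mttr):
    """A_i = MTBF / (MTBF + MTTR); both arguments in the same time unit."""
    return mtbf / (mtbf + mttr)


def mttr_needed(mtbf, target_availability):
    """Solve A_i = MTBF / (MTBF + MTTR) for the MTTR that just meets the target."""
    return mtbf * (1.0 - target_availability) / target_availability


TARGET = 0.99  # hypothetical availability target, chosen only for illustration

# As MTBF shrinks (lower reliability), the allowed repair time shrinks with it.
for mtbf_hours in (10_000, 1_000, 100):
    mttr_hours = mttr_needed(mtbf_hours, TARGET)
    print(f"MTBF {mtbf_hours:>6} h -> MTTR must be <= {mttr_hours:7.2f} h")
```

Each tenfold drop in MTBF demands a tenfold faster repair to hold the same A_{i}.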
In the operational world we talk of the operational availability equation. Operational availability looks at availability by collecting all of the abuses in a practical system:
A_{o} = MTBM/(MTBM+MDT).
The mean time between maintenance (MTBM) includes all corrective and preventive actions (compared to MTBF, which only accounts for failures). The mean down time (MDT) includes all time the system is down for corrective maintenance (CM), including delays (compared to MTTR, which only addresses repair time), plus self-imposed downtime for preventive maintenance (PM), although it is preferred to perform most PM actions while the equipment is operating. A_{o} is a smaller availability number than A_{i} because of naturally occurring abuses when you shoot yourself in the foot. Other details and aspects of availability are explained in the July 2001 problem of the month.
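To see why A_{o} falls below A_{i}, compare the two ratios for one piece of equipment. This is only a sketch: all four input values are hypothetical, picked so that MTBM is shorter than MTBF (preventive actions are counted) and MDT is longer than MTTR (delays and PM downtime are counted).

```python
def availability_ratio(mean_up_time, mean_down_time):
    """Shared form of both equations: up / (up + down)."""
    return mean_up_time / (mean_up_time + mean_down_time)


# Hypothetical values in hours, for illustration only.
mtbf, mttr = 2000.0, 8.0    # design view: failures and hands-on repair time only
mtbm, mdt = 1000.0, 24.0    # operational view: all maintenance, all down time

a_i = availability_ratio(mtbf, mttr)  # inherent availability
a_o = availability_ratio(mtbm, mdt)   # operational availability

print(f"A_i = {a_i:.4f}, A_o = {a_o:.4f}")
```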
Here is a table of availabilities for a one-year mission from the Reliability Engineering Principles training manual. The data shows time lost (remember, you are seeing equipment advertised with 99.999% availability; wonder how many failures you should expect during a one-year mission?):
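The time lost at a given availability follows directly from the availability equation with an 8760-hour year. The sketch below computes it for a few availability levels; the levels are chosen here for illustration and the output is calculated, not copied from the training manual's table.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def annual_downtime_hours(availability):
    """Downtime = (1 - A) * total time, from A = Uptime / (Uptime + Downtime)."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Illustrative availability levels (assumed, not taken from the source table).
for a in (0.90, 0.99, 0.999, 0.9999, 0.99999):
    hours = annual_downtime_hours(a)
    print(f"{a:.3%} available -> {hours:8.2f} h/yr ({hours * 60:9.1f} min/yr)")
```

Note that this gives time lost only. Availability by itself says nothing about how many failures produced that downtime.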
Reliability, in the simplest form, is described by the exponential distribution (Lusser’s equation), which describes random failures:
R = e^[-(λ*t)] = e^[-(t/Θ)] = e^(-N)
Where t = mission time (1 day, 1 week, 1 month, 1 year, etc., which you must determine), λ = failure rate, Θ = 1/λ = mean time to failure or mean time between failures, and N = number of failures during the mission; note that the number of failures during a mission can be a fractional value. Notice that reliability must have a dimension of mission time for calculating the results (watch out for advertising opportunities by presenting the facts based on shorter mission times than practical!). When in doubt about the failure mode, start with the exponential distribution because
1) It doesn’t take much information to find a reliability value,
2) It adequately represents complex systems comprising many failure modes/mechanisms, and
3) You rarely have to explain any complexities when talking with other people.
When the MTTF or MTBF or MTBM is long compared to the mission time, you will see high reliability (i.e., few chances for failure). When the MTTF or MTBF or MTBM is short compared to the mission time, you will see unreliability (i.e., many chances for failure).
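The effect of the ratio between mean time to failure and mission time can be checked with a few lines of code. This is a minimal sketch of the exponential equation above; the function name and the MTTF values are hypothetical.

```python
import math


def exponential_reliability(mission_time, mttf):
    """R = exp(-t / MTTF) = exp(-N), where N is the expected number of failures."""
    expected_failures = mission_time / mttf  # N, which may be fractional
    return math.exp(-expected_failures)


ONE_YEAR_HOURS = 8760

# Hypothetical MTTF values, from 100x the mission time down to 1/10 of it.
for mttf_hours in (876_000, 87_600, 8_760, 876):
    r = exponential_reliability(ONE_YEAR_HOURS, mttf_hours)
    n = ONE_YEAR_HOURS / mttf_hours
    print(f"MTTF {mttf_hours:>7} h: N = {n:5.2f} failures, R = {r:.4%}")
```

When MTTF equals the mission time, N = 1 and reliability is only about 36.8%.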
A more useful definition of reliability is found by use of the Weibull distribution:
R = e^[-(t/η)^β]
Where t = mission time (1 day, 1 week, 1 month, 1 year, etc.), η = characteristic age-to-failure, and β = Weibull shape factor. For components, β < 1 implies infant mortality failure modes, β = 1 implies chance failure modes, and β > 1 implies wear-out failure modes. If you’re evaluating a system, then the β values have no physical relationship to failure modes.
How do you find η and β? Take age-to-failure data for a specific failure mode (this requires that you know the time origin, you have some system for measuring the passage of time, and the definition of failure must be clear as described in the New Weibull Handbook), input the data into WinSMITH Weibull software, and the software returns values for η and β. Please note that if you have suspended (censored) data, then you must signify to the software that the data is suspended (simply put a minus sign in front of the age and the software will treat the data as a suspension).
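For readers without the software, the idea can be sketched with a simple median rank regression. This is an assumption-laden illustration and not WinSMITH Weibull's procedure: it uses Bernard's approximation for median ranks and ordinary least squares of y on x, it handles complete data only (no suspensions), and the function names and failure ages are hypothetical.

```python
import math


def weibull_reliability(t, eta, beta):
    """R = exp(-(t / eta) ** beta)."""
    return math.exp(-((t / eta) ** beta))


def fit_weibull_mrr(failure_ages):
    """Estimate (eta, beta) from complete failure data by median rank regression.

    Linearization: ln(-ln(1 - F)) = beta * ln(t) - beta * ln(eta).
    Suspended (censored) data are NOT supported in this sketch.
    """
    ages = sorted(failure_ages)
    n = len(ages)
    xs = [math.log(t) for t in ages]
    ys = []
    for i in range(1, n + 1):
        median_rank = (i - 0.3) / (n + 0.4)  # Bernard's approximation
        ys.append(math.log(-math.log(1.0 - median_rank)))

    # Ordinary least-squares line through the (x, y) points.
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta = sxy / sxx
    intercept = y_mean - beta * x_mean
    eta = math.exp(-intercept / beta)
    return eta, beta


# Hypothetical ages-to-failure in hours for one failure mode.
ages_to_failure = [410.0, 620.0, 780.0, 950.0, 1130.0, 1400.0]
eta_hat, beta_hat = fit_weibull_mrr(ages_to_failure)
print(f"eta = {eta_hat:.0f} h, beta = {beta_hat:.2f}")
print(f"R(500 h) = {weibull_reliability(500.0, eta_hat, beta_hat):.3f}")
```

With β = 1 the Weibull equation reduces to the exponential case, with η playing the role of the mean time to failure.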
Here is a table of reliabilities for a one-year mission from the Reliability Engineering Principles training manual that shows the failures per time interval (remember, when you see equipment advertised as “reliable,” ask for the mission time to which the reliability applies, or else you may get some real surprises when you find the calculation gives an impressive reliability for a 24-hour mission and you expect the equipment to be in continuous service for five years!):
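The mission-time trap is easy to demonstrate with the exponential equation. In this sketch the 50,000-hour MTTF is a hypothetical value, not data from the training manual; only the mission time changes between the two results.

```python
import math


def reliability_for_mission(mission_hours, mttf_hours):
    """Exponential reliability R = exp(-t / MTTF) for a stated mission time."""
    return math.exp(-mission_hours / mttf_hours)


MTTF_HOURS = 50_000.0        # hypothetical equipment MTTF
ONE_DAY = 24.0               # the mission time an advertisement might use
FIVE_YEARS = 5 * 8760.0      # the continuous service you actually expect

r_day = reliability_for_mission(ONE_DAY, MTTF_HOURS)
r_five_years = reliability_for_mission(FIVE_YEARS, MTTF_HOURS)

print(f"24-hour mission: R = {r_day:.4%}")
print(f"5-year mission:  R = {r_five_years:.4%}")
```

The same equipment looks nearly perfect over one day and survives five years without failure less than half the time.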
You can take the components, define the failure characteristics, and put the components into a system. For the system, you can model the availability and reliability by use of Monte Carlo models with programs such as RAPTOR. RAPTOR will require maintainability details, which are described in the July 2001 problem of the month. Availability models using % values fit the same rules as for reliability models using % values, so once you know the rules it’s fairly easy to get the system availability or system reliability.
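The rules are not spelled out here, so the sketch below assumes the standard block-diagram rules for independent components: values multiply in series, and unavailabilities (or unreliabilities) multiply in parallel. It is a closed-form illustration, not a Monte Carlo model and not RAPTOR; the component values are hypothetical.

```python
from math import prod


def series(values):
    """All blocks must work: system value is the product of the block values."""
    return prod(values)


def parallel(values):
    """Any one block suffices: 1 minus the product of the block failure fractions."""
    return 1.0 - prod(1.0 - v for v in values)


# Hypothetical system: two redundant pumps feeding one motor, as fractions.
pump_pair = parallel([0.95, 0.95])
system = series([pump_pair, 0.99])

print(f"Pump pair: {pump_pair:.4f}")
print(f"System:    {system:.4f}")
```

The same two functions apply whether the inputs are availabilities or reliabilities, provided the components fail independently.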
How much reliability and availability do you need? The answer depends upon money. Tradeoffs must be made to find the right combinations for the lowest long-term cost of ownership, which is a life cycle cost consideration.
Return to Barringer & Associates, Inc. homepage