For normal (Gaussian) distributions, the coefficient of variation measures the relative scatter in data with respect to the mean. The C_{v} is a term often used in six-sigma activities to express relative variation. It is also useful for calculating losses in process reliability problems when used with the Weibull distribution.
Background
The coefficient of variation provides a relative
measure of data dispersion compared to the mean: C_{v}=s/Xbar for the normal (bell shaped) distribution.
The coefficient of variation has no units. It may be reported as a simple
decimal value or it may be reported as a percentage as mentioned in the January '98 Problem Of The Month.
When the C_{v} is small, the data scatter
compared to the mean is small. When the C_{v}
is large, the amount of variation compared to the mean is large. For example,
the variation in the amount of pocket change compared to the average amount of
money you have in your wallet may be small to medium when measured by the
coefficient of variation. However, the variation in the amount of your pocket
change compared to the average
Scatter in the bell-shaped normal curve spans six sigma: ±3s = 6s covers 99.73% of the area under the curve. A 6-sigma representation covers 99.73% of the expected data occurrences, from -3s at CDF = 0.135% to +3s at CDF = 99.865%. Small standard deviations usually indicate data are closely clustered about the arithmetic mean. Large standard deviations usually indicate data are spread out and widely dispersed.
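The 99.73% figure and the two CDF values can be checked with a few lines of Python. This is only a sketch using the standard library error function; the function names are illustrative and are not part of the original problem.

```python
import math

def normal_coverage(k: float) -> float:
    """Fraction of a normal distribution within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))

def normal_cdf(z: float) -> float:
    """Standard normal CDF at z standard deviations from the mean."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(f"area within +/-3s: {normal_coverage(3.0):.4%}")
print(f"CDF at -3s:        {normal_cdf(-3.0):.4%}")
print(f"CDF at +3s:        {normal_cdf(3.0):.4%}")
```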
We will use the traditional figure of 99.73% of the data as a general guide when looking for "most of the data under the curve", even with the nonsymmetrical Weibull distribution. Remember that the skewed shape of most Weibull curves represents reality, but the lack of symmetry causes problems with the explanations. For the Weibull equivalent of six sigma, I’ll adopt the convention of the area between 0.1% and 99.9%, which equals 99.8% of the data.
The coefficient of variation for the Weibull distribution depends only upon the shape factor, beta. This is a very handy feature for use with straight-line Weibull probability plots: the slope of the line, which is beta, tells you the coefficient of variation. Large beta values give small variations, whereas small betas give large variations.
The Problem
What is the coefficient of variation for the Weibull distribution?
The New Weibull Handbook, in Appendix G2, defines the Weibull standard deviation as

$s = \eta \{\Gamma(1 + 2/\beta) - [\Gamma(1 + 1/\beta)]^2\}^{0.5}$ (this says s = eta times a function of beta)

and the Weibull mean as

$m = \eta \, \Gamma(1 + 1/\beta)$ (this says m = eta times another function of beta).

When the coefficient of variation (C_{v}) terms for s/m are collected, the characteristic value eta drops out. Then

$C_{v} = s/m = \{\Gamma(1 + 2/\beta) - [\Gamma(1 + 1/\beta)]^2\}^{0.5} / \Gamma(1 + 1/\beta)$ (this says C_{v} is a function of beta alone).
For C_{v}, the complicated gamma functions in the numerator and
denominator are all written in terms of beta.
If you evaluate C_{v} = s/m using Excel, you would have this expression (assuming beta is located in cell A1):
=((EXP(GAMMALN((2+A1)/A1))-(EXP(GAMMALN((1+A1)/A1)))^2)^0.5/EXP(GAMMALN((1+A1)/A1)))
{Test case: if beta = 4.9, C_{v} = 0.233317}
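For readers working outside Excel, the same calculation can be sketched in Python with the standard library gamma function. The function name weibull_cv is illustrative only; the test case is the one quoted above.

```python
import math

def weibull_cv(beta: float) -> float:
    """Weibull coefficient of variation s/m; depends only on the shape factor beta."""
    g1 = math.gamma(1.0 + 1.0 / beta)  # Gamma(1 + 1/beta): the mean divided by eta
    g2 = math.gamma(1.0 + 2.0 / beta)  # Gamma(1 + 2/beta)
    return math.sqrt(g2 - g1 * g1) / g1

# Test case from the text: beta = 4.9 should give C_v = 0.233317
print(round(weibull_cv(4.9), 6))
```

Because eta cancels in s/m, the function needs no scale argument.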
The complicated Excel equation can be simplified for a fairly good fit with the line segments shown in the graph below:
Hence a specified coefficient of variation will have a fixed slope on the Weibull plot. So, when a line is drawn by WinSMITH Weibull, the amount of variation is determined and identified. Steep betas (i.e., large values for beta) have small variations in the data (and thus the C_{v} is small). Shallow betas (i.e., small values for beta) have large variations in the data (and thus the C_{v} is large). For most issues, you want steep betas with small C_{v}.
A few other tips to think about:
1) Knowing the Gaussian normal C_{v} does not imply that you can find the Weibull C_{v} or the Weibull beta values.
2) A Weibull probability plot of normal data produces a reasonably good straight-line plot.
3) A normal probability plot of Weibull data often does not produce a very good straight line.
4) Weibull probability plots tell you details about the distribution, rather than you specifying only a single (normal) distribution.
5) Using normal distributions for manufacturing data is often wrong, as the data contain drift effects and other biases which may be larger than the normal errors.
6) Perhaps Weibull control charts are more robust than standard SPC charts for the reason given in 5) above.
7) The C_{v} is a more interesting concept for statisticians, but it is not a particularly helpful value for engineers.
How do the coefficients of variation relate to the Weibull shape factors
identified as beta?
Mathcad was used to evaluate the Weibull coefficient of variation at specific values by solving for beta. The results are shown in the following table.
Weibull Beta Values For Weibull Coefficient Of Variations = s/m

| C_{v} | beta | C_{v} | beta | C_{v} | beta | C_{v} | beta |
|---|---|---|---|---|---|---|---|
| 2.0 | 0.5427 | 0.5 | 2.1014 | 0.05 | 24.951 | 0.01 | 127.54 |
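The table was produced with Mathcad. As an alternative sketch, the same inversion can be done in Python by bisection, relying on the fact that C_{v} falls steadily as beta rises. The function names, the search bracket, and the tolerance below are assumptions chosen for illustration.

```python
import math

def weibull_cv(beta: float) -> float:
    """Weibull coefficient of variation s/m for shape factor beta."""
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    return math.sqrt(g2 - g1 * g1) / g1

def beta_for_cv(target_cv: float, lo: float = 0.2, hi: float = 500.0,
                tol: float = 1e-10) -> float:
    """Find beta such that weibull_cv(beta) = target_cv by bisection.

    Assumes the answer lies between lo and hi (an arbitrary bracket).
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_cv(mid) > target_cv:
            lo = mid  # C_v still too large, so beta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

for cv in (2.0, 0.5, 0.05, 0.01):
    print(f"C_v = {cv:<5} beta = {beta_for_cv(cv):.4f}")
```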
Suppose you made a Weibull plot. The plot showed beta (the shape factor) = 10 and eta (the scale factor which in reliability terms would be the characteristic life) = 1000. You would find the coefficient of variation = 0.1203 or 12%. You would also find the Weibull standard deviation = 114.457 and the Weibull mean = 951.3508.
This is mathematically correct even though few people use the Weibull mean.
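The numbers in this example can be reproduced with a short Python sketch built on the Handbook formulas for the Weibull mean and standard deviation. The function name is illustrative.

```python
import math

def weibull_mean_sd(beta: float, eta: float) -> tuple[float, float]:
    """Return (mean, standard deviation) of a Weibull with shape beta and scale eta."""
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    mean = eta * g1
    sd = eta * math.sqrt(g2 - g1 * g1)
    return mean, sd

mean, sd = weibull_mean_sd(beta=10.0, eta=1000.0)
print(f"mean = {mean:.4f}, sd = {sd:.3f}, C_v = {sd / mean:.4f}")
```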
How does the nonsymmetrical Weibull coefficient of variation
relate to the symmetrical normal distribution six-sigma concepts?
The normal distribution has a bell shaped curve. Weibull curves are not usually bell shaped. Thus Weibull six sigma concepts will not be as neat and tidy as for the normal distribution.
Using the example above, three Weibull standard deviations either side of the Weibull mean is 951.3508 ± 3*114.457, which gives 607.98 and 1294.7. Using Excel to evaluate the Weibull $F(t) = 1 - e^{-(t/\eta)^{\beta}}$, F(t) for t = 1294.7 is 99.99982149% and F(t) for t = 607.98 is 0.68768286984%, so this covers 99.31213862% of the data for this nonsymmetrical distribution. Of course this is not quite the same as the 99.73% you would expect for ±3*s with the symmetrical normal distribution.
Note the right-hand tail of the Weibull probability density function contains 0.000178507% of the data while the left-hand tail of the Weibull distribution contains 0.68768286984% of the data (or roughly 3853 times more data in the left-hand tail) for this particular set of beta and eta. Right away, you see the Weibull curve is going to complicate the concept of explaining and computing what a six-sigma value is for the Weibull distribution. So consider the following rules of thumb to avoid mind-numbingly complex issues.
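The tail areas at the mean ± 3 standard deviations can be checked with a small Python sketch. It takes the mean and standard deviation quoted above as inputs; the function names are illustrative, and the right-hand tail is computed directly from the survival function to avoid round-off when subtracting from 1.

```python
import math

def weibull_cdf(t: float, beta: float, eta: float) -> float:
    """F(t) = 1 - exp(-(t/eta)**beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def weibull_sf(t: float, beta: float, eta: float) -> float:
    """Survival function 1 - F(t) = exp(-(t/eta)**beta)."""
    return math.exp(-((t / eta) ** beta))

beta, eta = 10.0, 1000.0
mean, sd = 951.3508, 114.457          # values quoted in the text
low, high = mean - 3.0 * sd, mean + 3.0 * sd

left_tail = weibull_cdf(low, beta, eta)    # area below mean - 3s
right_tail = weibull_sf(high, beta, eta)   # area above mean + 3s
coverage = 1.0 - left_tail - right_tail

print(f"left tail  = {left_tail:.6%}")
print(f"right tail = {right_tail:.9%}")
print(f"coverage   = {coverage:.6%}")
print(f"left/right tail ratio = {left_tail / right_tail:.0f}")
```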
Consider this simple rule-of-thumb approximation for ±3*s = 6*s = 99.73%. Take the value at 99.9% occurrence (1213.2) and its complement, which occurs at 0.1% (501.2). Thus 99.9% - 0.1% = 99.8% of the data (which is close to 99.73%) is expected to lie between 501.2 and 1213.2, which corresponds closely to the 6s concept that 99.73% of the data lies under the normal curve.
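The two percentile values come from inverting the Weibull CDF. A minimal Python sketch, with an illustrative function name and the beta = 10, eta = 1000 example from above:

```python
import math

def weibull_quantile(p: float, beta: float, eta: float) -> float:
    """Value t at which F(t) = p, i.e. t = eta * (ln(1/(1-p)))**(1/beta)."""
    return eta * (-math.log1p(-p)) ** (1.0 / beta)

beta, eta = 10.0, 1000.0
t_low = weibull_quantile(0.001, beta, eta)    # 0.1% point
t_high = weibull_quantile(0.999, beta, eta)   # 99.9% point
print(f"0.1% point = {t_low:.1f}, 99.9% point = {t_high:.1f}")
```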
Earlier I had proposed another rule-of-thumb simplification. The mean, m, is seldom used in the Weibull distribution as the “best central tendency”; instead the characteristic value eta is used. The eta simplification was OK for large betas (what you desire for process reliability issues) but not OK for small betas (less than 1). I’m withdrawing this eta simplification as it causes too many useless arguments! I will stick with the traditional definition C_{v} = s/m = (Weibull standard deviation) / (Weibull mean) as used in many 6-sigma studies. You can see the variability in the table shown below.
Weibull Beta Values For Weibull Coefficient Of Variations = s/m

| C_{v} | beta | C_{v} | beta | C_{v} | beta | C_{v} | beta |
|---|---|---|---|---|---|---|---|
| 2.0 | 0.5427 | 0.5 | 2.1014 | 0.05 | 24.951 | 0.01 | 127.54 |
For process reliability problems the value of beta = 100 seems to be a practical value for world class performance, and this produces C_{v} = 0.0127 or 1.27% of variability around the Weibull mean.
An Example:
What are some typical values for the coefficient of variation based on C_{v} = s/m = (Weibull standard
deviation) / (Weibull mean)?
Since the coefficient of variation is a relative measure, the absolute values depend upon the situation. Consider the example of money in your pocket. Assume the mean is m = US$100, and notice how the coefficient of variation shows the scatter in your pocket money between CDF = 0.1% and CDF = 99.9% for the ~6*s range. The Low/High values are calculated from

$t = \{m / \Gamma(1 + 1/\beta)\} \{\ln[1/(1 - \mathrm{CDF})]\}^{1/\beta}$

using CDF = 0.001 for the low end and CDF = 0.999 for the upper end. Note that $m = \eta \, \Gamma(1 + 1/\beta)$, so you will recognize the first term in braces as eta.
Typical Values For The Coefficient Of Variation For Process Variation

| Plain English | Example of C_{v} = s/m, % | 99.8% ~6*s |
|---|---|---|
| Poor control | C_{v} = 20% -> beta = 5.797 | For beta = 5.797 and m = US$100, compute eta = $107.9979 |
| Fair control | C_{v} = 10% -> beta = 12.153 | For beta = 12.153 and m = US$100, compute eta = $104.3039 |
| Tight control | C_{v} = 5% -> beta = 24.949 | For beta = 24.949 and m = US$100, compute eta = $102.20798 |
| Excellent control | C_{v} = 2.5% -> beta = 50.586 | For beta = 50.586 and m = US$100, compute eta = $101.115397 |
| World class | C_{v} = 1.25% -> beta = 101.880 | For beta = 101.880 and m = US$100, compute eta = $100.56024 |
| Seldom achieved | C_{v} = 0.625% -> beta = 204.480 | For beta = 204.480 and m = US$100, compute eta = $100.2807 |
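The eta values in the table, and the low and high pocket-money values at CDF = 0.1% and 99.9%, can be generated with a short Python sketch. It assumes the US$100 figure is the Weibull mean, as in the formula for the Low/High values; the function and variable names are illustrative.

```python
import math

def eta_from_mean(mean: float, beta: float) -> float:
    """Characteristic value eta = mean / Gamma(1 + 1/beta)."""
    return mean / math.gamma(1.0 + 1.0 / beta)

def weibull_quantile(p: float, beta: float, eta: float) -> float:
    """Value t at which the Weibull CDF equals p."""
    return eta * (-math.log1p(-p)) ** (1.0 / beta)

MEAN = 100.0  # US$ in your pocket, taken as the Weibull mean
ROWS = [
    ("Poor control", 5.797),
    ("Fair control", 12.153),
    ("Tight control", 24.949),
    ("Excellent control", 50.586),
    ("World class", 101.880),
    ("Seldom achieved", 204.480),
]

for label, beta in ROWS:
    eta = eta_from_mean(MEAN, beta)
    low = weibull_quantile(0.001, beta, eta)    # low end of the ~6*s range
    high = weibull_quantile(0.999, beta, eta)   # high end of the ~6*s range
    print(f"{label:18s} beta = {beta:8.3f}  eta = ${eta:9.4f}  "
          f"low = ${low:7.2f}  high = ${high:7.2f}")
```

The spread between the low and high values narrows as beta increases, which is the point of the table.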
Large coefficients of variation say you’ll have big variations in the amount of money in your pockets. Small coefficients of variation say you will know within a very small range how much money will be in your pocket. Of course having lots of money in your pocket may be desirable, but practical limits exist as to the maximum amount of money you will have in your own pockets. The same case exists for production output.
The coefficient of variation will be used to set the nameplate rating for production processes in the March '98 Problem Of The Month.
Other pages you may want to visit concerning similar issues are:
· Production Reliability Example With Nameplate Ratings
· Key Performance Indicators From Weibull Production Plots
· Process Reliability Plots With Flat Line Slopes
· Process Reliability Line Segments
· Papers On Process Reliability As PDF Files For No-charge Downloads
Comments:
Refer to the caveats on the Problem Of The Month Page about the limitations of the following solution. Maybe you have a better idea on how to solve the problem. Maybe you will find where I've screwed up the solution and can point out my errors as you check my calculations. Email your comments, criticism, and corrections to Paul Barringer.
Technical tools are only interesting toys for engineers until results are converted into a business solution involving money and time. Complete your analysis with a bottom line which converts $'s and time so you have answers that will interest your management team!
Thanks to John Hawkins of PPG Industries for catching a typo: the incorrect phrase “…nonsymmetrical Weibull Cv to symmetrical 6-sigma…” now reads “…nonsymmetrical Weibull Cv relate to the symmetrical 6-sigma…”. He also pointed out the need to add a closing “)” to the C_{v} equations copied to Excel. I’ve also added a test case for each equation so you can validate your results. HPB