A Simple Test For Mixed Failures
 
H. Paul Barringer, PE
Barringer & Associates, Inc.
Humble, Texas
hpaul@barringer1.com

Very Brief Summary-
Weibull analysis often fails to clearly show the details of mixed failure modes.  Adding Crow-AMSAA analysis to the Weibull analysis provides information for separating some failure modes in a Weibull plot.  Failure to record and use suspended (censored) data is a fatal error in Weibull analysis.

The Big Picture-
Mixed failure modes dumped into a single data set are a common problem, often solved by use of SuperSMITH Weibull for simple mixtures and SuperSMITH Ybath for very complicated mixed failure modes.  Knowledgeable and experienced engineers won’t believe every answer coming from software analysis of many data sets!  Experienced engineers know to use their heads for thinking and the software for grunting out the answers.  Unknowledgeable and naive engineers will believe each computer analysis.

When your engineering head and the engineering analysis match the hypothesis of your experience level then you have a high probability of getting the correct answer.  Experienced engineers also know that a mismatch between the data and the analysis results means you had better dig deeper because there is more to the story.

Engineers are paid to produce positive financial results from solving engineering problems.  If the problem is always simple and the analysis is always simple, why would you need engineers to solve the issues?  In that case you’re spending too much money to get too few results.  In short, engineers, make yourselves useful in your analysis rather than performing like a robot and believing every answer from the software analysis.  Use your head and use your software to solve problems and reduce further failures to make systems more reliable.

Dr. Henry Petroski, Professor at Duke University, is the author of many books on failures (he frequently writes about bridge failures because no one can deny when a bridge fails).  Dr. Petroski makes these points about modern engineering traps we walk into without thinking:
     1) Before wide use of computers-
          Engineers had to solve problems with slide rules which required thinking problems
          thru very carefully to get units, etc. correct.
     2) After wide use of computers-
          Engineers stopped testing every answer to make sure the units were correct and
          engineers stopped testing basic concepts because everything produced by a computer
          is believed to be absolutely correct—this is not the time to be naive! 
In the computer world we work ourselves into corners by having too much faith in computer answers!  In short, pull your head out of the dark corners of the lower portion of your body to think and test the results!

In case you doubt the concept of 2) above, see the failure of the Mars Climate Orbiter (launched in 1998 and lost in 1999), which was caused by mismatched units in calculations.

First Dataset-
Figure 1 lists the ages to failure, in the time sequence recorded, for up to 50 samples.  Each sample in Figure 1 was run until its failure was recorded.

Figure 1:

Figure 2 shows a Weibull plot of the ages to failure from Figure 1:

Figure 2: A Weibull Plot From Figure 1 Data

 

The Weibull plot of Figure 1’s data, spanning roughly 100 cycles to 10,000 cycles, shows an unacceptable curve fit with pve% = 1.930, whereas pve% values of 10 or more indicate a good curve fit.  The data points also trend concave downward on the Weibull plot.
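As a rough illustration of how a 2-parameter Weibull line is fitted to ages to failure, the sketch below uses median rank regression with Benard's approximation.  It is a stand-in under stated assumptions, not the SuperSMITH calculation: the function name weibull_mrr and the sample ages are hypothetical, and the r-squared value it returns is only a simple indicator of straightness, not the pve% statistic quoted above.

```python
import math

def weibull_mrr(ages):
    """Fit a 2-parameter Weibull to complete (all failed) data by median rank regression."""
    t = sorted(ages)
    n = len(t)
    # X = ln(age); Y = ln(-ln(1 - F)) with F from Benard's median rank approximation
    xs = [math.log(v) for v in t]
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / syy                    # regress X on Y (age is the uncertain variable)
    beta = 1.0 / slope                   # Weibull shape
    eta = math.exp(mx - slope * my)      # Weibull scale (age at Y = 0, i.e. F = 63.2%)
    r2 = sxy * sxy / (sxx * syy)         # straightness of the plotted points
    return beta, eta, r2

# Hypothetical ages to failure in cycles (NOT the Figure 1 data)
ages = [150, 420, 610, 980, 1300, 1750, 2400, 3100, 4600, 7200]
beta, eta, r2 = weibull_mrr(ages)
print(f"beta={beta:.3f}  eta={eta:.1f}  r^2={r2:.3f}")
```

A low r-squared, like a low pve%, is a hint that a single straight line does not describe the data and that more digging is needed.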

Before you convert Figure 2 into a 3-parameter Weibull plot, you must conform to 4 mandatory requirements for use of a 3-parameter Weibull as described in The New Weibull Handbook, 5th edition, page 3-10:

1.      You must see curvature (concavity pointed downward in this case) of data points
              in a 2-parameter Weibull plot. 
              We meet this curvature appearance on a 2-parameter Weibull plot. 

2.      There must be a physical explanation for the displacement of the X-axis which
              causes curvature.  Curvature (concavity pointed downward) implies displacement
              of the X-axis to the right for a failure free time period, and the displacement, t0,
              will be positive.  Curvature (concavity pointed upward) implies displacement of
              the X-axis to the left for previous failures not reported in the population which
              have been removed from the dataset, and the displacement, t0, will be negative.
              Getting a better curve fit to the data is not a physical reason for use of a
              3-parameter t0!
 
              We fail for lack of a physical reason for curvature on a 2-parameter plot.

3.      You need at least 21 failures (more data if the curvature is slight). 
              We meet this requirement with 50 failures.

4.      The pve% value must improve with a 3-parameter analysis.  The 2-parameter analysis shows
              pve% = 1.93, which represents a poor curve fit.  The 3-parameter analysis shows
              pve% = 10.68, which represents a good curve fit and is above
              the minimum requirement of pve% of 10 or higher for a satisfactory improvement. 
              The 3-parameter t0 = -89.05 adjusted the curved line into a straight
              line with an improvement in pve%.
              We meet this criterion.


We flunk requirement #2 for physical explanation of curvature!  Therefore we cannot use a 3-parameter Weibull distribution.
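The four requirements can be written down as a simple checklist.  The sketch below is only an illustration of the logic above; the function name and argument names are hypothetical, and the judgment calls (is there curvature, is there a physical reason) still have to come from the engineer, not the software.

```python
def three_parameter_allowed(curvature_seen, physical_reason, n_failures, pve_2p, pve_3p):
    """Return (allowed, checks) for the four 3-parameter Weibull requirements."""
    checks = {
        "curvature on 2-parameter plot": curvature_seen,
        "physical explanation for displacement": physical_reason,
        "at least 21 failures": n_failures >= 21,
        # pve% must improve and reach the minimum of 10 for a good fit
        "pve% improves to 10 or more": pve_3p > pve_2p and pve_3p >= 10.0,
    }
    return all(checks.values()), checks

# Values quoted in the text for the first dataset
allowed, checks = three_parameter_allowed(
    curvature_seen=True, physical_reason=False,
    n_failures=50, pve_2p=1.93, pve_3p=10.68)
for name, passed in checks.items():
    print(f"{'pass' if passed else 'FAIL'}: {name}")
print("3-parameter Weibull allowed:", allowed)
```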


A different analysis may help us understand this dataset and provide a positive response to #2 above.  Why do we have a poor 2-parameter curve fit?  Why does the 2-parameter Weibull plot show a beta value near 1 (literally the beta = 1.098 as shown in Figure 2)?  We expected the data would show a stronger wearout failure mode with a larger beta value.

Next we’ll make a Crow-AMSAA log-log reliability growth plot.  The X-axis represents cumulative time.  The Y-axis represents cumulative failures. 
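A minimal sketch of how the points for such a plot can be built and fitted follows.  It assumes the samples were tested one after another, so cumulative time is the running sum of the ages to failure, and it fits the usual Crow-AMSAA form N(t) = lambda * t^beta by least squares on the log-log scale.  The function names and the sample ages are hypothetical, and a simple regression is used here rather than whatever estimator your software applies.

```python
import math

def crow_amsaa_points(ages_in_sequence):
    """Return (cumulative time, cumulative failures) pairs for ages listed in test order."""
    points, cum_time = [], 0.0
    for n, age in enumerate(ages_in_sequence, start=1):
        cum_time += age
        points.append((cum_time, n))
    return points

def fit_loglog(points):
    """Least-squares line through ln(N) vs ln(t); returns (lambda, beta)."""
    xs = [math.log(t) for t, _ in points]
    ys = [math.log(n) for _, n in points]
    m = len(points)
    mx, my = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    beta = sxy / sxx                     # line slope on the log-log plot
    lam = math.exp(my - beta * mx)       # intercept at t = 1
    return lam, beta

# Hypothetical ages to failure in the order recorded (NOT the Figure 1 data)
ages = [820, 640, 910, 300, 410, 350, 280, 4200, 5100, 3900]
pts = crow_amsaa_points(ages)
lam, beta = fit_loglog(pts)
print(f"lambda={lam:.4g}  slope beta={beta:.3f}")
```

A steeper slope means failures are arriving faster per unit of cumulative time; a flatter slope means they are arriving more slowly.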

We also asked the test crew for:
              1) age to failure data as recorded in time sequence, and
              2) other data that they think might be helpful.
We learned from the test crew that ages to failure were recorded.  The three vendors painted different color codes on their parts, and the parts were tested in batches by color, one batch for each of the three vendors.

Here is the log-log plot without the trendlines:

Figure 3: A log-log plot of cum failures vs cum cycles

Data points 1-15 have a moderate slope, data points 16-35 have a steeper slope (very undesirable), and data points 36-50 have a much flatter slope (very desirable).  Each of these trend lines represents a different vendor by color code and the data was presented in this order so it’s easy to take the ages to failure by failure mode and calculate the Weibull lines for each group of data as shown in Figure 4!

Figure 4: A Crow AMSAA Plot Of The Vendors Test

Notice the good curve fits (Fit-p% > 10) for all three Crow-AMSAA lines.  Also note that data points 15 thru 35 are used to compute the trendline for the steep slope and data points 35 thru 50 are used to compute the trendline for the flat slope.  Sharing the end points between adjacent segments avoids the huge gaps that can appear in the trendlines, and in MTBF plots, when you are too pure about how the data is divided.
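The shared end point idea can be sketched as follows: each segment after the first starts at the last point of the previous segment.  The function names are hypothetical, the cumulative cycles are synthetic values built so the true slopes are known, and a plain least-squares slope is assumed rather than the estimator used to draw Figure 4.

```python
import math

def loglog_slope(points):
    """Least-squares slope of ln(cum failures) vs ln(cum time)."""
    xs = [math.log(t) for t, _ in points]
    ys = [math.log(n) for _, n in points]
    m = len(points)
    mx, my = sum(xs) / m, sum(ys) / m
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def segment_slopes(points, segment_ends):
    """Fit each segment; point numbers are 1-based and break points are shared."""
    slopes, start = [], 1
    for end in segment_ends:
        slopes.append(loglog_slope(points[start - 1:end]))
        start = end                      # next segment re-uses this break point
    return slopes

# Synthetic cumulative cycles (hypothetical) with true slopes 1.0, 2.0 and 0.5
t15 = 1500.0
t35 = t15 * (35 / 15) ** (1 / 2.0)
points = []
for n in range(1, 51):
    if n <= 15:
        t = t15 * (n / 15)
    elif n <= 35:
        t = t15 * (n / 15) ** (1 / 2.0)
    else:
        t = t35 * (n / 35) ** (1 / 0.5)
    points.append((t, n))

slopes = segment_slopes(points, (15, 35, 50))   # points 1-15, 15-35, 35-50
print([round(s, 3) for s in slopes])
```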

To avoid warranty failures and longer in-service failures, the vendor with the flat slope should become our preferred vendor.  The vendor with the steep slope should be excluded as a vendor.  The moderate slope vendor should be our backup source until we can persuade them to make improvements.

Now we can develop a horse race between vendors with the following Weibull plot of the data.  Figure 5 should be used in discussions with the vendors to exclude the inferior vendors and to put pressure on the winning vendor to do even better in the “horse” race.  It’s always amazing what factual data and competitive situations can do for producing more favorable results when the data is presented and explained.

Figure 5: Weibull Results Of Our Life Tests With Data Identified By Different Vendors Color

Notice the beta values are very consistent because the beta values are driven by the physics of failure.  The differences show up in the eta values which are driven by the strength factors.  Use the highest eta value (5222) as the competitive datum.  The runner up vendor’s eta is 811.3/5222 = 15.5% of the winning eta.  The loser’s eta is 473.1/5222 = 9.1% of the winning eta!  The nudge to the vendors must be discussions about the Figure 5 plot and the need for making substantial improvements in their products based on our test results.  Remind the vendors improvements are needed for their competitive survival.
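The eta comparison is simple arithmetic, sketched here with the three eta values quoted above.  The dictionary keys are placeholder labels, not the vendors' actual color codes.

```python
# Eta values quoted in the text; the labels are placeholders
etas = {"winner": 5222.0, "runner_up": 811.3, "loser": 473.1}

datum = max(etas.values())                      # best eta is the competitive datum
ratios = {name: eta / datum for name, eta in etas.items()}

for name, ratio in ratios.items():
    print(f"{name:10s} eta = {etas[name]:7.1f}  ({ratio:.1%} of the datum)")
```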

A Story Of Competitive Advantage And The Results-
Here is a story of competitive pressure from World War II days based on personal discussions in 1960 (I was so impressed with the story that, here in 2016, I still remember it).  The conversation was with a physics expert at the University Of Virginia in Charlottesville, VA who was personally involved in the story.

At the end of WWII America grabbed rocket experts from Germany, because we didn’t have rocket scientists, and brought them to the USA.  The Russians grabbed the German isotope separation experts and took them to Russia because they didn’t have separation experts.

Russia did not speedily repatriate the Germans at the end of the war.  Russia turned their World War II captives loose to return to Germany in 1958 as a result of negotiations with Chancellor Konrad Adenauer and his team.  One of them was a German centrifuge expert named Dr. Zippe, another physics expert, who had developed high speed centrifuges (90,000 RPM!) for separation of isotopes in Russia.

When Dr. Zippe was returned to Germany, the USA CIA recruited him to replicate his centrifuge experiments for isotope separation.  The USA had already separated U235 by the expensive and difficult gaseous diffusion process; in 1943 diffusion was the winner out of 5 different methods in competitive trials including centrifugation.  The USA wanted to understand the centrifuge technology (at that time I was working in Oak Ridge, TN at The Oak Ridge Gaseous Diffusion Plant (K-25) and I was assigned to the centrifugation process development team).  As soon as Dr. Zippe returned to Germany the USA grabbed him and sent him to the University of Virginia to replicate his centrifuge developments in Russia.  Upon completion of his replication efforts in about two years Dr. Zippe was able to return to Germany.  Physicists think differently than engineers and they have many clever ideas.

UVA had two attributes that were desirable for this replication effort.
     1) It was the world center for ultrahigh speed rotation.  Dr. Jesse Beams had already suspended ball bearings (the size of the ball in your ball point pen) by magnets, and  rotated them at 1,000,000 revolutions per minute in a vacuum which was a perfect background for high speed isotope separation centrifuges, and,
     2) The university had a leftover laboratory from WWII located behind fences atop a mountain from which you could look down on Monticello, the home of the third president of the USA.  The laboratory had horizontal holes drilled thru the mountain from one side to the other.  The laboratory was named The Naval Research Facility.

During WWII on one side of the tunnels were gun crews with ammunition supplied by war vendors.  The other side of the tunnels had armor crews with armor supplied by foundries making armor for WWII.  Neither side of the tunnels was allowed to speak with the other.

The ammunition vendors were harangued by the military that they were unable to pierce the armor.  They were requested to pep up their ammunition!  They complied with improvements to the point armor was pierced. 

Next, the military took the pierced armor to the foundries.  They raised hell with the foundries claiming their inferior armor was resulting in our boys being killed in combat and they had to pep up the shell resistance!  This scenario persisted during the entire WWII campaign with massive and continuous improvements in both armor and ammunition. 

Bottom line of the story: You’ve got to be able to measure, guide, and direct competitive improvements based on physical test results in a competitive environment!  Otherwise: no competition, no improvements, and frequently deterioration in performance from the lack of competition.

Age To Failure Test Data By Vendor -
The Monte Carlo spreadsheet in Excel offers two practical examples:
              1)  Failure modes in sequence with a Weibull plot and a Crow-AMSAA plot of the simulation data with up to 500,000 simulations.
              2)  Failure modes with three competing failure modes with a Weibull plot and a Crow-AMSAA plot of 1,000 or fewer data points.

Figure 6 shows the set up sheet for in sequence failure modes where the red samples are tested first, then the green samples, and finally the blue samples are tested.

Figure 6: Excel Setup Sheet For In Sequence Failures
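For readers without the spreadsheet, the in sequence case can be approximated in a few lines.  This is a sketch, not a copy of the Excel model: the batch sizes and the beta and eta values below are hypothetical placeholders rather than the Figure 6 inputs.  It relies on the standard library call random.weibullvariate(alpha, beta), where alpha is the scale (eta) and beta is the shape.

```python
import random

def simulate_in_sequence(batches, seed=0):
    """Draw ages to failure batch by batch, in the order the batches are tested.

    batches: list of (label, sample count, Weibull beta, Weibull eta)
    """
    rng = random.Random(seed)            # fixed seed so a run can be repeated
    results = []
    for label, count, beta, eta in batches:
        for _ in range(count):
            # weibullvariate takes (scale, shape), i.e. (eta, beta)
            results.append((label, rng.weibullvariate(eta, beta)))
    return results

# Hypothetical setup: red tested first, then green, then blue
batches = [
    ("red",   15, 3.0,  800.0),
    ("green", 20, 3.0,  500.0),
    ("blue",  15, 3.0, 5000.0),
]
run = simulate_in_sequence(batches)
for label, age in run[:3]:
    print(f"{label}: {age:.0f} cycles")
print("total samples:", len(run))
```

The ages from one run can be plotted as a single Weibull data set, or accumulated in test order for a Crow-AMSAA plot.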


Figure 7 shows the Weibull plot for a sample of 50 data points tested in sequence.  Notice the Weibull plot of the mixed failure modes, on a single iteration, has a beta of 0.896 which is near 1 as you should expect from a Weibull plot of mixed failure mode data.

Figure 7: A Weibull Plot For A Test Sample Of 50 Data Points Plotted In Sequence Tested

Figure 8 shows the Crow-AMSAA plot for the in sequence case.  The confidence limits were obtained by running 10,000 iterations.

Figure 8: A Crow-AMSAA Plot Of Failures Showing 90% confidence Intervals

In Figure 8 the “Late” test data shows the preferred slow collection of failure data, whereas the “Mid” data shows very high failure rates, and the “Early” test data shows the second highest failure rate.  The preferred vendor is the “Late” test results with the flatter line slope, the backup vendor is the “Early” test results, and the “Mid” tests vendor is the least preferred and soon to be dropped vendor.  Now we have enough data for instigating a competitive “horse race” for better results from the vendors just as explained in the Naval Research Tests at The University Of Virginia.  Did you also note the very wide spread of test results from the Crow-AMSAA plot with 90% confidence limits?
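One way to get Monte Carlo limits of this kind is to repeat the in sequence simulation many times and, at each failure number, take percentiles of the cumulative cycles.  The sketch below assumes that approach; it is not necessarily how the spreadsheet computes its limits, and the batch parameters are hypothetical.

```python
import random

def cum_time_bounds(batches, iterations=2000, level=0.90, seed=0):
    """Percentile bounds on cumulative time at each failure number.

    batches: list of (sample count, Weibull beta, Weibull eta) in test order
    """
    rng = random.Random(seed)
    n_total = sum(count for count, _, _ in batches)
    samples = [[] for _ in range(n_total)]
    for _ in range(iterations):
        cum, k = 0.0, 0
        for count, beta, eta in batches:
            for _ in range(count):
                cum += rng.weibullvariate(eta, beta)   # (scale, shape)
                samples[k].append(cum)
                k += 1
    tail = (1.0 - level) / 2.0           # 5% in each tail for a 90% interval
    bounds = []
    for s in samples:
        s.sort()
        lower = s[int(tail * (len(s) - 1))]
        upper = s[int((1.0 - tail) * (len(s) - 1))]
        bounds.append((lower, upper))
    return bounds

# Hypothetical batches: "Early", "Mid" and "Late" in test order
bounds = cum_time_bounds([(15, 3.0, 800.0), (20, 3.0, 500.0), (15, 3.0, 5000.0)])
for n in (15, 35, 50):
    lower, upper = bounds[n - 1]
    print(f"failure {n}: {lower:.0f} to {upper:.0f} cumulative cycles")
```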

Figure 9 shows the setup sheet for mixed failure mode data where the different failure modes compete for the life results.  Suspensions are represented with a minus sign, and they are a mandatory part of the data set.

Figure 9 Setup Sheet For Mixed Failure Modes—Numbers In Color Represent Failures

Figure 10 shows the Weibull analysis including suspensions.  Notice how close the Weibull results on Figure 10 are to the input data when the suspensions are included.  In Figure 9, n = number of data (suspensions + failures) and s = number of suspensions, where a minus sign marks each suspension.  In Figure 10 the pve% > 10 shows a good curve fit.

Figure 10 Weibull Analysis Of Failures Including Suspensions


Figure 10 shows why you MUST include suspensions into the calculations—else you get stupid results!  Compare Figure 10’s correct results and Figure 11’s incorrect results because suspensions were excluded.  Yes, the pve% in Figure 11 is better but the answers are wrong by excluding suspensions!

Figure 11 Weibull Analysis Of Failures Incorrectly EXCLUDING SUSPENSIONS
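The effect of dropping suspensions can be reproduced with a small simulation.  The sketch below makes several assumptions: three hypothetical competing modes, the minus sign convention for suspensions described above, and a rank regression fit using Johnson's adjusted ranks with Benard's approximation (a common textbook method, which may differ from the method behind Figures 10 and 11).  The function names are hypothetical.

```python
import math
import random

def fit_weibull(data):
    """Rank regression Weibull fit; negative values are suspensions."""
    items = sorted(data, key=abs)
    n = len(items)
    prev_rank = 0.0
    xs, ys = [], []
    for i, value in enumerate(items):
        if value < 0:
            continue                     # not plotted, but it shifts later ranks
        reverse_rank = n - i
        rank = (reverse_rank * prev_rank + n + 1) / (reverse_rank + 1)
        prev_rank = rank
        f = (rank - 0.3) / (n + 0.4)     # Benard's median rank approximation
        xs.append(math.log(value))
        ys.append(math.log(-math.log(1.0 - f)))
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / syy                    # regress ln(age) on the rank variable
    return 1.0 / slope, math.exp(mx - slope * my)   # (beta, eta)

def simulate_first_mode(n_units, modes, seed=0):
    """Each unit ends at its earliest mode; other modes suspend the first mode."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_units):
        times = [rng.weibullvariate(eta, beta) for beta, eta in modes]
        age = min(times)
        data.append(age if times[0] == age else -age)
    return data

modes = [(2.0, 1000.0), (1.0, 2000.0), (3.0, 1500.0)]   # hypothetical (beta, eta)
data = simulate_first_mode(1000, modes)
beta_with, eta_with = fit_weibull(data)
beta_without, eta_without = fit_weibull([v for v in data if v > 0])
print(f"input:               beta={modes[0][0]:.2f} eta={modes[0][1]:.0f}")
print(f"with suspensions:    beta={beta_with:.2f} eta={eta_with:.0f}")
print(f"without suspensions: beta={beta_without:.2f} eta={eta_without:.0f}")
```

Compare the two fitted eta values against the input eta for the first mode to see how far the fit moves when the suspensions are thrown away.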

Figure 12 shows a Crow-AMSAA plot of the mixed failure modes for 1,000 data points.

Figure 12 Shows C-A Line Slope With Beta ≈ 1 Which Is Expected For Mixed Failure Modes

Summary Of Simulation Findings -
Keeping failure data by failure mode is important.  Sometimes you learn important details by use of a Weibull plot and a Crow-AMSAA plot.  Include your suspended data.  It’s very important.

Download a copy of this problem as a PDF file.  You can download a copy of the Excel Monte Carlo simulation as a self-extracting zip file identified as Mixed_Failure_Modes.exe.

Last revised July 19, 2016
© Barringer & Associates, Inc. 2016



 
