Improved Goodness of Fit: P-value of the Correlation Coefficient


By Wes Fulton



An improved goodness-of-fit measure based upon the correlation coefficient, r, is called the P-value. This section presents an overview of the correlation coefficient P-value and its relationship to the critical correlation coefficient, CCC. The CCC of Dr. Abernethy, author of The New Weibull Handbook, is the value of r equivalent to a 10% P-value. The CCC represents a lower acceptance bound on r (or equivalently a lower bound on the square of the correlation coefficient, r^2) at 10% significance. Values of actual r below the CCC indicate an extraordinarily poor fit and happen only 1 out of 10 times on average in good sampling. The CCC is known to increase as sample size increases, and it also depends on the type of model (Weibull, lognormal, etc.).
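As a rough illustration of how a CCC can arise, the sketch below simulates complete Weibull samples, computes r on Weibull probability paper for each, and takes the 10th percentile of those r values. It assumes Benard's median rank approximation for the plotting positions. The function names (`median_ranks`, `weibull_plot_r`, `ccc`), the trial count, and the seed are hypothetical, and this is not the procedure used to produce the handbook's CCC values.

```python
import math
import random


def median_ranks(n):
    """Benard's approximation to the median rank of each ordered failure."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def pearson_r(xs, ys):
    """Ordinary (Pearson) correlation coefficient of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def weibull_plot_r(times):
    """r of a complete sample on Weibull paper: ln(t) vs ln(-ln(1 - F))."""
    t = sorted(times)
    x = [math.log(v) for v in t]
    y = [math.log(-math.log(1.0 - f)) for f in median_ranks(len(t))]
    return pearson_r(x, y)


def ccc(n, trials=2000, seed=1):
    """Simulated 10th percentile of r for complete Weibull samples of size n."""
    rng = random.Random(seed)
    # r does not change with Weibull scale or shape, so unit values are used.
    rs = sorted(
        weibull_plot_r([rng.weibullvariate(1.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    )
    return rs[int(0.10 * trials)]
```

Calling `ccc(10)` and then `ccc(30)` should show the bound rising with sample size, in line with the behavior described above.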


The true P-value of correlation for a particular data set is the ranking of the actual correlation among all the possible correlation values for the sample size of the data set and the model type selected. P-values for probability plotting must be between 0% and 100%, and higher values indicate a better fit. The author suggested that P-value display should be implemented in software to facilitate model selection and approval. Wes Fulton has developed the capability in the WSW [WinSMITH Weibull] software to estimate the true P-value for the Weibull 2-parameter, Weibull 3-parameter, lognormal, normal, and extreme value (Gumbel) distributions and equivalent models. Two P-value estimates are implemented in the WSW software: the prr-value(%) and the pve%.
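The ranking idea can be sketched in a few lines: simulate many complete samples of the same size from the selected model, and report the percentage of simulated r values that fall below the observed r. The example below is a minimal sketch for the Weibull 2-parameter case only, assuming Benard's median ranks; `weibull_plot_r` and `p_value_percent` are hypothetical names, and the code is not the WSW implementation.

```python
import math
import random


def weibull_plot_r(times):
    """r of a complete sample on Weibull paper: ln(t) vs ln(-ln(1 - F))."""
    t = sorted(times)
    n = len(t)
    x = [math.log(v) for v in t]
    # Benard's median rank approximation (an assumption of this sketch).
    y = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def p_value_percent(times, trials=2000, seed=1):
    """Percentage of simulated r values that fall below the observed r."""
    n = len(times)
    observed = weibull_plot_r(times)
    rng = random.Random(seed)
    below = 0
    for _ in range(trials):
        # r is unaffected by Weibull scale and shape, so unit values are used.
        sample = [rng.weibullvariate(1.0, 1.0) for _ in range(n)]
        if weibull_plot_r(sample) < observed:
            below += 1
    return 100.0 * below / trials


# Two clusters of failures plot as a step, not a line, on Weibull paper.
bimodal = [1.0, 1.01, 1.02, 1.03, 1.04, 100.0, 101.0, 102.0, 103.0, 104.0]
print(p_value_percent(bimodal))
```

A data set that falls close to a straight line on the plot ranks near the top of the simulated values, while the two-cluster sample above ranks near the bottom.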


You can get a relatively precise estimate of the true P-value, designated prr-value(%), after pivotal rank regression confidence has been calculated. The prr-value(%), when available, is displayed only in the WSW software report output. The accuracy of this estimate can be improved by increasing the quantity of Monte Carlo trials. However, each different seed value or trial quantity will produce a slightly different prr-value(%). A prr-value(%) less than 10% is considered low, since it corresponds to an r below the CCC. If the prr-value(%) is above 10%, the model is not rejected at a 10% significance level. Suspension effects are taken into account in the prr-value(%) for Type II censoring (all suspensions having magnitude equal to the highest occurrence point). The prr-value(%) only approximates the effects of other arrangements of suspension values.
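The effect of trial quantity can be gauged with a standard binomial argument, offered here as a rough guide and not as the WSW calculation. It assumes the estimate is the fraction of N independent Monte Carlo trials falling below the observed r, with p the true P-value expressed as a fraction; both symbols are introduced only for this illustration.

```latex
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p\,(1-p)}{N}},
\qquad
p = 0.10,\ N = 1000
\ \Rightarrow\
\mathrm{SE}(\hat{p}) = \sqrt{\frac{0.10 \times 0.90}{1000}} \approx 0.0095
```

Under these assumptions, an estimate near 10% from 1000 trials would scatter by roughly one percentage point from seed to seed, and by about 0.3 percentage points with 10000 trials.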


The alternative to prr-value(%) is an instantaneous estimate of the true P-value that requires no further simulation. It is designated pve%, and it is displayed on both the software plot and the report output when selected. It comes from previous Monte Carlo simulation and modeling by Wes Fulton. The pve% value should be within a few percent of the prr-value(%) if both are calculated. At pve% values close to 10% there should be even closer agreement, since the pve% is based upon the CCC values already calculated and modeled. As with the prr-value(%), a pve% less than 10% is considered low, and if the pve% is above 10% the model is not rejected at a 10% significance level. Note that the pve% was modeled only for complete samples, in the same way the CCC was modeled. Samples with suspensions present tremendous variations in proportion suspended, suspension type, and suspension value relationship effects, so when estimating P-value with pve% the sample size is taken to be the quantity of failures only. Using both the CCC and the pve% in this way has generally given good service. Use the prr-value(%) described above if a more precise P-value estimate is needed, especially for censored data.


Using the P-value as the goodness-of-fit measure has many benefits. Chi Chao Liu [1997] concluded for regression that the P-value provides the best indication of goodness of fit. Both correlation P-value estimates, prr-value(%) and pve%, retain the same meaning independent of sample size and independent of statistical model. They can also be used directly for model comparison and selection. Distribution analysis in WSW determines the model with the best fit using the highest value of pve% as the indicator of the best selection. The pve% goodness-of-fit measurement for regression is now the default for the WSW software. However, WSW can still show prr-value(%), r^2, or r^2-CCC^2 in the results.
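Because a P-value has the same meaning for every model, selection reduces to picking the largest one. The sketch below illustrates this for two candidate models using a simulated P-value for each, not the pve% model in WSW. It assumes complete samples and Benard's median ranks, and the names `plot_positions`, `draw_log_times`, `p_value_percent`, and `best_model` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Ordinary (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def plot_positions(n, model):
    """Probability-paper y values for Benard's median ranks."""
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    if model == "weibull":
        return [math.log(-math.log(1.0 - f)) for f in ranks]
    return [NormalDist().inv_cdf(f) for f in ranks]  # lognormal paper


def draw_log_times(n, model, rng):
    """Sorted ln(time) values for one simulated complete sample."""
    if model == "weibull":
        return sorted(math.log(rng.weibullvariate(1.0, 1.0)) for _ in range(n))
    return sorted(rng.gauss(0.0, 1.0) for _ in range(n))


def p_value_percent(times, model, trials=2000, seed=1):
    """Percentage of simulated r values below the observed r for this model."""
    n = len(times)
    y = plot_positions(n, model)
    observed = pearson_r(sorted(math.log(t) for t in times), y)
    rng = random.Random(seed)
    below = sum(
        pearson_r(draw_log_times(n, model, rng), y) < observed
        for _ in range(trials)
    )
    return 100.0 * below / trials


def best_model(times, models=("weibull", "lognormal")):
    """Return the model name with the highest P-value, plus all scores."""
    scores = {m: p_value_percent(times, m) for m in models}
    return max(scores, key=scores.get), scores
```

Comparing raw r or r^2 across models would not be equivalent, since the distribution of r differs by model and sample size. Ranking each r against its own model's simulated values is what puts the candidates on a common scale.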


© Fulton Findings 2004  Published with permission by Wes Fulton to Barringer & Associates, Inc. May 2, 2006




Last revised 5/02/2006
© Barringer & Associates, Inc. 2006