 HISTOGRAM Statement

Example 4.2: Fitting Lognormal, Weibull, and Gamma Curves

 See CAPCURV in the SAS/QC Sample Library

To find an appropriate model for a process distribution, you should consider curves from several distribution families. As shown in this example, you can use the HISTOGRAM statement to fit more than one type of distribution and display the density curves on the same histogram.

The gap between two plates is measured (in cm) for each of 50 welded assemblies selected at random from the output of a welding process assumed to be in statistical control. The lower and upper specification limits for the gap are 0.3 cm and 0.8 cm, respectively. The measurements are saved in a data set named PLATES.

   data plates;
      label gap='Plate Gap in cm';
      input gap @@;
      datalines;
   0.746  0.357  0.376  0.327  0.485  1.741  0.241  0.777  0.768
   0.409  0.252  0.512  0.534  1.656  0.742  0.378  0.714  1.121
   0.597  0.231  0.541  0.805  0.682  0.418  0.506  0.501  0.247
   0.922  0.880  0.344  0.519  1.302  0.275  0.601  0.388  0.450
   0.845  0.319  0.486  0.529  1.547  0.690  0.676  0.314  0.736
   0.643  0.483  0.352  0.636  1.080
   ;


The following statements fit three distributions (lognormal, Weibull, and gamma) and display their density curves on a single histogram:

   title1 'Distribution of Plate Gaps';
   legend1 frame cframe=ligr cborder=black position=center;
   proc capability data=plates noprint;
      var gap;
      specs  lsl = 0.3  llsl = 3   clsl=black
             usl = 0.8  lusl = 20  cusl=black;
      histogram /
         midpoints=0.2 to 1.8 by 0.2
         lognormal (l=1  color=red)
         weibull   (l=2  color=blue)
         gamma     (l=8  color=yellow)
         nospeclegend
         vaxis   =  axis1
         cframe  = ligr
         legend  = legend1;
      inset n mean(5.3) std='Std Dev'(5.3) skewness(5.3)
            / pos = ne  header = 'Summary Statistics' cfill = blank;
      axis1  label=(a=90 r=0);
   run;


The LOGNORMAL, WEIBULL, and GAMMA options superimpose fitted curves on the histogram in Output 4.2.1. The L= options specify distinct line types for the curves. Note that a threshold parameter of zero (θ = 0) is assumed for each curve. In applications where the threshold is not zero, you can specify θ with the THETA= option.

Output 4.2.1: Superimposing a Histogram with Fitted Curves

The LOGNORMAL, WEIBULL, and GAMMA options also produce the summaries for the fitted distributions shown in Output 4.2.2, Output 4.2.3, and Output 4.2.4.

Output 4.2.2: Summary of Fitted Lognormal Distribution

 Distribution of Plate Gaps

 The CAPABILITY Procedure Fitted Lognormal Distribution for gap

 Parameters for Lognormal Distribution

 Parameter   Symbol   Estimate
 Threshold   Theta    0
 Scale       Zeta     -0.58375
 Shape       Sigma    0.499546
 Mean                 0.631932
 Std Dev              0.336436

 Goodness-of-Fit Tests for Lognormal Distribution

 Test                 Statistic             DF   p Value
 Kolmogorov-Smirnov   D        0.06441431        Pr > D        >0.150
 Cramer-von Mises     W-Sq     0.02823022        Pr > W-Sq     >0.500
 Anderson-Darling     A-Sq     0.24308402        Pr > A-Sq     >0.500
 Chi-Square           Chi-Sq   7.51762213    6   Pr > Chi-Sq    0.276

 Quantiles for Lognormal Distribution

                 Quantile
 Percent   Observed   Estimated
     1.0    0.23100     0.17449
     5.0    0.24700     0.24526
    10.0    0.29450     0.29407
    25.0    0.37800     0.39825
    50.0    0.53150     0.55780
    75.0    0.74600     0.78129
    90.0    1.10050     1.05807
    95.0    1.54700     1.26862
    99.0    1.74100     1.78313

Output 4.2.2 provides four goodness-of-fit tests for the lognormal distribution: the chi-square test and three tests based on the EDF (Anderson-Darling, Cramer-von Mises, and Kolmogorov-Smirnov). See "Chi-Square Goodness-of-Fit Test" and "EDF Goodness-of-Fit Tests" for more information. The EDF tests are superior to the chi-square test because they are not dependent on the set of midpoints used for the histogram.

At the α = 0.10 significance level, all four tests support the conclusion that the two-parameter lognormal distribution with estimated scale parameter ζ = -0.58 and estimated shape parameter σ = 0.50 provides a good model for the distribution of plate gaps.
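The Mean, Std Dev, and Estimated quantile entries in Output 4.2.2 can be cross-checked against the fitted parameters. The DATA step below is a minimal sketch, assuming the standard lognormal relations (mean = θ + exp(ζ + σ²/2), quantile = θ + exp(ζ + σ z_p), where z_p is the standard normal quantile); the variable names fit_mean, fit_std, p, and q are illustrative and do not appear in the example.

```sas
/* Sketch: recompute summary values for the fitted lognormal curve.  */
/* Parameter estimates are copied from Output 4.2.2.                 */
data _null_;
   theta = 0;           /* threshold parameter      */
   zeta  = -0.58375;    /* scale parameter estimate */
   sigma = 0.499546;    /* shape parameter estimate */

   /* Mean and standard deviation implied by the parameters */
   fit_mean = theta + exp(zeta + sigma**2 / 2);
   fit_std  = sqrt(exp(2*zeta + sigma**2) * (exp(sigma**2) - 1));
   put fit_mean= 8.6 fit_std= 8.6;

   /* Estimated quantiles; PROBIT returns the standard normal quantile */
   do p = 0.05, 0.50, 0.95;
      q = theta + exp(zeta + sigma * probit(p));
      put p= 5.2 q= 8.5;
   end;
run;
```

The values written to the log can be compared with the Mean and Std Dev rows and with the 5.0, 50.0, and 95.0 percent rows of the Estimated column in Output 4.2.2.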

Output 4.2.3: Summary of Fitted Weibull Distribution

 Distribution of Plate Gaps

 The CAPABILITY Procedure Fitted Weibull Distribution for gap

 Parameters for Weibull Distribution

 Parameter   Symbol   Estimate
 Threshold   Theta    0
 Scale       Sigma    0.719208
 Shape       C        1.961159
 Mean                 0.637641
 Std Dev              0.339248

 Goodness-of-Fit Tests for Weibull Distribution

 Test                 Statistic             DF   p Value
 Cramer-von Mises     W-Sq     0.1593728         Pr > W-Sq      0.016
 Anderson-Darling     A-Sq     1.1569354         Pr > A-Sq     <0.010
 Chi-Square           Chi-Sq   15.0252996    6   Pr > Chi-Sq    0.020

 Quantiles for Weibull Distribution

                 Quantile
 Percent   Observed   Estimated
     1.0    0.23100     0.06889
     5.0    0.24700     0.15817
    10.0    0.29450     0.22831
    25.0    0.37800     0.38102
    50.0    0.53150     0.59661
    75.0    0.74600     0.84955
    90.0    1.10050     1.10040
    95.0    1.54700     1.25842
    99.0    1.74100     1.56691

Output 4.2.3 provides two EDF goodness-of-fit tests for the Weibull distribution: the Anderson-Darling and the Cramer-von Mises tests. (See Table 4.17 for a complete list of the EDF tests available in the HISTOGRAM statement.) The probability values for the chi-square and EDF tests are all less than 0.10, indicating that the data do not support a Weibull model.

Output 4.2.4: Summary of Fitted Gamma Distribution

 Distribution of Plate Gaps

 The CAPABILITY Procedure Fitted Gamma Distribution for gap

 Parameters for Gamma Distribution

 Parameter   Symbol   Estimate
 Threshold   Theta    0
 Scale       Sigma    0.155198
 Shape       Alpha    4.082646
 Mean                 0.63362
 Std Dev              0.313587

 Goodness-of-Fit Tests for Gamma Distribution

 Test                 Statistic             DF   p Value
 Chi-Square           Chi-Sq   12.3075959    6   Pr > Chi-Sq    0.055

 Quantiles for Gamma Distribution

                 Quantile
 Percent   Observed   Estimated
     1.0    0.23100     0.13326
     5.0    0.24700     0.21951
    10.0    0.29450     0.27938
    25.0    0.37800     0.40404
    50.0    0.53150     0.58271
    75.0    0.74600     0.80804
    90.0    1.10050     1.05392
    95.0    1.54700     1.22160
    99.0    1.74100     1.57939

Output 4.2.4 provides a chi-square goodness-of-fit test for the gamma distribution. (None of the EDF tests are currently supported when the scale and shape parameters of the gamma distribution are estimated; see Table 4.17.) The probability value for the chi-square test is less than 0.10, indicating that the data do not support a gamma model.
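The chi-square probability values quoted for the three fits can be recomputed from the reported statistics and degrees of freedom. The following DATA step is a sketch that assumes each p-value is the upper-tail probability of a chi-square distribution with the DF shown in the output; the variable names dist, chisq, df, and p_value are illustrative.

```sas
/* Sketch: upper-tail chi-square probabilities for the three fits.   */
/* Statistics and DF are copied from Outputs 4.2.2, 4.2.3, and 4.2.4. */
data _null_;
   length dist $ 9;
   input dist $ chisq df;
   /* PROBCHI returns the chi-square CDF, so 1 - CDF is the p-value */
   p_value = 1 - probchi(chisq, df);
   put dist= p_value= 6.3;
   datalines;
Lognormal  7.51762213 6
Weibull   15.0252996  6
Gamma     12.3075959  6
;
run;
```

Comparing each p_value with 0.10 reproduces the decisions described above: only the lognormal fit has a chi-square p-value above that level.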

Based on this analysis, the fitted lognormal distribution is the best model for the distribution of plate gaps. You can use this distribution to calculate useful quantities. For instance, you can compute the probability that the gap of a randomly sampled plate exceeds the upper specification limit, as follows:

\[
\Pr[\mathrm{gap} > \mathrm{USL}]
  = \Pr\!\left[ Z > \frac{\log(\mathrm{USL} - \theta) - \zeta}{\sigma} \right]
  = 1 - \Phi\!\left( \frac{\log(\mathrm{USL} - \theta) - \zeta}{\sigma} \right)
\]

where Z has a standard normal distribution, and Φ is the standard normal cumulative distribution function. Note that Φ can be computed with the DATA step function PROBNORM. In this example, USL = 0.8 and Pr[ gap > 0.8] = 0.2352. This value is expressed as a percent (Est Pct > USL) in Output 4.2.2.
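As a sketch of that calculation, the DATA step below standardizes log(USL - θ) with the fitted parameters and takes the upper-tail normal probability with PROBNORM. The parameter values are copied from Output 4.2.2; the variable names z and p_above are illustrative and are not part of the example.

```sas
/* Sketch: Pr[gap > USL] under the fitted lognormal distribution. */
data _null_;
   usl   = 0.8;         /* upper specification limit          */
   theta = 0;           /* threshold parameter                */
   zeta  = -0.58375;    /* lognormal scale parameter estimate */
   sigma = 0.499546;    /* lognormal shape parameter estimate */

   /* Standardize log(USL - theta), then take the upper tail */
   z       = (log(usl - theta) - zeta) / sigma;
   p_above = 1 - probnorm(z);
   put 'Pr[gap > USL] = ' p_above 6.4;
run;
```

The value written to the log can be compared with the probability of 0.2352 stated above.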
