The LOGISTIC Procedure

Example 39.7: Goodness-of-Fit Tests and Subpopulations

A study investigates the effects of two binary factors, A and B, on a binary response, Y. Subjects are randomly selected from subpopulations defined by the four possible combinations of levels of A and B. The number of subjects responding with each level of Y is recorded in the variable F and entered into data set A.

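   data a;
      /* F is the number of subjects in each A x B x Y cell */
      do A=0,1;
         do B=0,1;
            do Y=1,2;
               input F @@;   /* trailing @@ holds the data line across iterations */
               output;
            end;
         end;
      end;
      datalines;
   23 63 31 70 67 100 70 104
   ;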
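
Before fitting a model, it can help to confirm that the nested DO loops assigned each count to the intended cell. The following is a minimal sketch using PROC FREQ with F as a weight. It is not part of the original example, and the NOPERCENT and NOCOL options are there only to keep the tables compact.

```sas
/* Illustrative check, not part of the original example */
proc freq data=a;
   weight F;                          /* treat F as the cell count */
   tables A*B*Y / nopercent nocol;    /* one B-by-Y table for each level of A */
run;
```

Given the order in which the DATA step reads the counts, the row for A=0, B=0 should show 23 subjects with Y=1 and 63 with Y=2.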

A full model is fit to examine the main effects of A and B as well as the interaction effect of A and B.

   proc logistic data=a;
      freq F;
      model Y=A B A*B;
   run;

Output 39.7.1: Full Model Fit

The LOGISTIC Procedure

Model Information
Data Set                     WORK.A
Response Variable            Y
Number of Response Levels    2
Number of Observations       8
Frequency Variable           F
Sum of Frequencies           528
Link Function                Logit
Optimization Technique       Fisher's scoring

Response Profile
Ordered Value    Y    Total Frequency
1                1    191
2                2    337

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion    Intercept Only    Intercept and Covariates
AIC          693.061           691.914
SC           697.330           708.990
-2 Log L     691.061           683.914

Testing Global Null Hypothesis: BETA=0
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    7.1478        3     0.0673
Score               6.9921        3     0.0721
Wald                6.9118        3     0.0748

Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    1     -1.0074     0.2436            17.1015       <.0001
A            1      0.6069     0.2903             4.3714       0.0365
B            1      0.1929     0.3254             0.3515       0.5533
A*B          1     -0.1883     0.3933             0.2293       0.6321

Association of Predicted Probabilities and Observed Responses
Percent Concordant    42.2     Somers' D    0.118
Percent Discordant    30.4     Gamma        0.162
Percent Tied          27.3     Tau-a        0.054
Pairs                 64367    c            0.559


Pearson and deviance goodness-of-fit tests cannot be obtained for this model because it is saturated: the full model has four parameters, which leaves no residual degrees of freedom. For a binary response model, the goodness-of-fit tests have m-q degrees of freedom, where m is the number of subpopulations and q is the number of model parameters. In the preceding model, m=q=4, so the tests have zero degrees of freedom.

Results of the model fit are shown in Output 39.7.1. Neither the A*B interaction (p=0.6321) nor the B main effect (p=0.5533) is significant. If a reduced model containing only the A effect is fit, two degrees of freedom become available for testing goodness of fit (m=4, q=2). Specifying the SCALE=NONE option requests the Pearson and deviance statistics. With single-trial syntax, the AGGREGATE= option is needed to define the subpopulations in the study. Specifying AGGREGATE=(A B) creates subpopulations from the four combinations of levels of A and B. Although the B effect is dropped from the model, B is still needed to define the original subpopulations in the study. If AGGREGATE=(A) were specified, only two subpopulations would be created from the levels of A, resulting in m=q=2 and zero degrees of freedom for the tests.

   proc logistic data=a;
      freq F;
      model Y=A / scale=none aggregate=(A B);
   run;
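
The AGGREGATE= requirement described above applies to single-trial syntax. As a rough sketch of the alternative, the same counts could be entered in events/trials form, with one observation per combination of A and B. By default, PROC LOGISTIC treats each observation as its own subpopulation, so the four observations should define the four subpopulations without AGGREGATE=. The data set name a2 and the variable names Events and Trials are hypothetical names introduced here. The counts are the Y=1 frequencies and the cell totals derived from data set A.

```sas
/* Hypothetical events/trials layout of the same data */
data a2;
   input A B Events Trials;   /* Events = count with Y=1, Trials = cell total */
   datalines;
0 0 23  86
0 1 31 101
1 0 67 167
1 1 70 174
;

proc logistic data=a2;
   model Events/Trials = A / scale=none;   /* one subpopulation per observation */
run;
```

Output 39.7.2, shown next, comes from the single-trial program above rather than from this sketch. The two fits would be expected to agree, because both model the probability of Y=1.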

Output 39.7.2: Reduced Model Fit

The LOGISTIC Procedure

Model Information
Data Set                     WORK.A
Response Variable            Y
Number of Response Levels    2
Number of Observations       8
Frequency Variable           F
Sum of Frequencies           528
Link Function                Logit
Optimization Technique       Fisher's scoring

Response Profile
Ordered Value    Y    Total Frequency
1                1    191
2                2    337

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Deviance and Pearson Goodness-of-Fit Statistics
Criterion    DF    Value     Value/DF    Pr > ChiSq
Deviance     2     0.3541    0.1770      0.8377
Pearson      2     0.3531    0.1765      0.8382
Number of unique profiles: 4

Model Fit Statistics
Criterion    Intercept Only    Intercept and Covariates
AIC          693.061           688.268
SC           697.330           696.806
-2 Log L     691.061           684.268

Testing Global Null Hypothesis: BETA=0
Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    6.7937        1     0.0091
Score               6.6779        1     0.0098
Wald                6.6210        1     0.0101

Analysis of Maximum Likelihood Estimates
Parameter    DF    Estimate    Standard Error    Chi-Square    Pr > ChiSq
Intercept    1     -0.9013     0.1614            31.2001       <.0001
A            1      0.5032     0.1955             6.6210       0.0101

Association of Predicted Probabilities and Observed Responses
Percent Concordant    28.3     Somers' D    0.112
Percent Discordant    17.1     Gamma        0.246
Percent Tied          54.6     Tau-a        0.052
Pairs                 64367    c            0.556


The goodness-of-fit tests (Output 39.7.2) show that dropping the B main effect and the A*B interaction simultaneously does not result in significant lack of fit. The large p-values of the deviance and Pearson tests (0.8377 and 0.8382) provide no evidence against the null hypothesis that the reduced model fits.
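
Because the full model is saturated for these four subpopulations, the deviance of the reduced model can be cross-checked against the -2 Log L values in the two outputs. The following DATA step is a minimal sketch of that arithmetic. The variable names are arbitrary and introduced here for illustration; PROBCHI is the SAS chi-square distribution function.

```sas
/* Illustrative cross-check using values copied from Outputs 39.7.1 and 39.7.2 */
data _null_;
   m2ll_full    = 683.914;                 /* -2 Log L, full model (Output 39.7.1) */
   m2ll_reduced = 684.268;                 /* -2 Log L, reduced model (Output 39.7.2) */
   deviance = m2ll_reduced - m2ll_full;    /* likelihood ratio statistic */
   df = 4 - 2;                             /* m - q for the reduced model */
   p  = 1 - probchi(deviance, df);         /* upper-tail chi-square probability */
   put deviance= df= p=;                   /* write the results to the log */
run;
```

The difference is 0.354, which agrees with the reported deviance of 0.3541 up to the rounding of the printed -2 Log L values.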


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.