 The CATMOD Procedure

## Example 22.3: Logistic Regression, Standard Response Function

In this data set, from Cox and Snell (1989), ingots are prepared with different heating and soaking times and tested for their readiness to be rolled. The response variable Y has value 1 for ingots that are not ready and value 0 otherwise. The explanatory variables are Heat and Soak.

   title 'Maximum Likelihood Logistic Regression';
   data ingots;
      input Heat Soak nready ntotal @@;
      /* one record for the ingots that are not ready (Y=1) */
      Count=nready;
      Y=1;
      output;
      /* one record for the remaining, ready ingots (Y=0) */
      Count=ntotal-nready;
      Y=0;
      output;
      datalines;
7 1.0 0 10   14 1.0 0 31   27 1.0 1 56   51 1.0 3 13
7 1.7 0 17   14 1.7 0 43   27 1.7 4 44   51 1.7 0  1
7 2.2 0  7   14 2.2 2 33   27 2.2 0 21   51 2.2 0  1
7 2.8 0 12   14 2.8 0 31   27 2.8 1 22   51 4.0 0  1
7 4.0 0  9   14 4.0 0 19   27 4.0 1 16
;


Logistic regression analysis is often used to investigate the relationship between discrete response variables and continuous explanatory variables. For logistic regression, the continuous design-effects are declared in a DIRECT statement. The following statements produce Output 22.3.1 through Output 22.3.7.

   proc catmod data=ingots;
      weight Count;
      direct Heat Soak;
      model Y=Heat Soak / freq covb corrb;
   quit;


Output 22.3.1: Maximum Likelihood Logistic Regression

 Maximum Likelihood Logistic Regression

 The CATMOD Procedure

| Item | Value |
|---|---|
| Response | Y |
| Response Levels | 2 |
| Weight Variable | Count |
| Populations | 19 |
| Data Set | INGOTS |
| Total Frequency | 387 |
| Frequency Missing | 0 |
| Observations | 25 |

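Population Profiles

| Sample | Heat | Soak | Sample Size |
|---|---|---|---|
| 1 | 7 | 1 | 10 |
| 2 | 7 | 1.7 | 17 |
| 3 | 7 | 2.2 | 7 |
| 4 | 7 | 2.8 | 12 |
| 5 | 7 | 4 | 9 |
| 6 | 14 | 1 | 31 |
| 7 | 14 | 1.7 | 43 |
| 8 | 14 | 2.2 | 33 |
| 9 | 14 | 2.8 | 31 |
| 10 | 14 | 4 | 19 |
| 11 | 27 | 1 | 56 |
| 12 | 27 | 1.7 | 44 |
| 13 | 27 | 2.2 | 21 |
| 14 | 27 | 2.8 | 22 |
| 15 | 27 | 4 | 16 |
| 16 | 51 | 1 | 13 |
| 17 | 51 | 1.7 | 1 |
| 18 | 51 | 2.2 | 1 |
| 19 | 51 | 4 | 1 |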

You can verify that the populations are defined as you intended by looking at the "Population Profiles" table in Output 22.3.1.

Output 22.3.2: Response Summaries

 Maximum Likelihood Logistic Regression

 The CATMOD Procedure

Response Profiles

| Response | Y |
|---|---|
| 1 | 0 |
| 2 | 1 |

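Response Frequencies

| Sample | Response Number 1 | Response Number 2 |
|---|---|---|
| 1 | 10 | 0 |
| 2 | 17 | 0 |
| 3 | 7 | 0 |
| 4 | 12 | 0 |
| 5 | 9 | 0 |
| 6 | 31 | 0 |
| 7 | 43 | 0 |
| 8 | 31 | 2 |
| 9 | 31 | 0 |
| 10 | 19 | 0 |
| 11 | 55 | 1 |
| 12 | 40 | 4 |
| 13 | 21 | 0 |
| 14 | 21 | 1 |
| 15 | 15 | 1 |
| 16 | 10 | 3 |
| 17 | 1 | 0 |
| 18 | 1 | 0 |
| 19 | 1 | 0 |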

Since the "Response Profiles" table shows the response level ordering as 0, 1, the default response function, the logit, is defined as $\log\left(p_{Y=0}/p_{Y=1}\right)$.

Output 22.3.3: Iteration History

 Maximum Likelihood Logistic Regression

 The CATMOD Procedure

Maximum Likelihood Analysis

| Iteration | Sub Iteration | -2 Log Likelihood | Convergence Criterion | Parameter Estimate 1 | Parameter Estimate 2 | Parameter Estimate 3 |
|---|---|---|---|---|---|---|
| 0 | 0 | 536.49592 | 1.0000 | 0 | 0 | 0 |
| 1 | 0 | 152.58961 | 0.7156 | 2.1594 | -0.0139 | -0.003733 |
| 2 | 0 | 106.76066 | 0.3003 | 3.5334 | -0.0363 | -0.0120 |
| 3 | 0 | 96.692171 | 0.0943 | 4.7489 | -0.0640 | -0.0299 |
| 4 | 0 | 95.383825 | 0.0135 | 5.4138 | -0.0790 | -0.0498 |
| 5 | 0 | 95.345659 | 0.000400 | 5.5539 | -0.0819 | -0.0564 |
| 6 | 0 | 95.345613 | 4.8289E-7 | 5.5592 | -0.0820 | -0.0568 |
| 7 | 0 | 95.345613 | 7.73E-13 | 5.5592 | -0.0820 | -0.0568 |

 Maximum likelihood computations converged.

Seven Newton-Raphson iterations are required to find the maximum likelihood estimates.
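The first rows of the iteration history can be checked by hand. This is a consistency check against the printed values, not a description of the internals of PROC CATMOD. At iteration 0 all three parameters are zero, so every predicted probability is 1/2, and with a total frequency of 387 the starting value is

$$
-2\log L_0 = -2 \cdot 387 \cdot \log\tfrac{1}{2} = 774 \log 2 \approx 536.4959
$$

The printed convergence criterion agrees with the relative decrease in $-2\log L$ from one iteration to the next. For iteration 1,

$$
\frac{536.49592 - 152.58961}{536.49592} \approx 0.7156
$$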

Output 22.3.4: Analysis of Variance Table

 Maximum Likelihood Logistic Regression

 The CATMOD Procedure

Maximum Likelihood Analysis of Variance

| Source | DF | Chi-Square | Pr > ChiSq |
|---|---|---|---|
| Intercept | 1 | 24.65 | <.0001 |
| Heat | 1 | 11.95 | 0.0005 |
| Soak | 1 | 0.03 | 0.8639 |
| Likelihood Ratio | 16 | 13.75 | 0.6171 |

The analysis of variance table (Output 22.3.4) shows that the model fits since the likelihood ratio goodness-of-fit test is nonsignificant. It also shows that the length of heating time is a significant factor with respect to readiness but that length of soaking time is not.
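The likelihood ratio line can be cross-checked with a short DATA step. The sketch below assumes that the 16 degrees of freedom come from 19 populations, each contributing one response function, minus the 3 estimated parameters, and it uses the chi-square value as printed. The result should agree with the tabulated p-value only up to rounding.

```sas
data _null_;
   /* 19 populations x 1 response function, less 3 parameters */
   df = 19*1 - 3;
   /* upper-tail chi-square probability for the printed statistic */
   p  = 1 - probchi(13.75, df);
   put df= p= 6.4;
run;
```

A large p-value here means the saturated model does not fit detectably better than the three-parameter model, which is the sense in which the model "fits".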

Output 22.3.5: Maximum Likelihood Estimates

 Maximum Likelihood Logistic Regression

 The CATMOD Procedure

Analysis of Maximum Likelihood Estimates

| Effect | Parameter | Estimate | Standard Error | Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|
| Intercept | 1 | 5.5592 | 1.1197 | 24.65 | <.0001 |
| Heat | 2 | -0.0820 | 0.0237 | 11.95 | 0.0005 |
| Soak | 3 | -0.0568 | 0.3312 | 0.03 | 0.8639 |
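The chi-square values in the table of maximum likelihood estimates are consistent with Wald statistics, that is, the squared ratio of each estimate to its standard error, referred to a chi-square distribution with 1 degree of freedom. The following sketch recomputes them from the printed values. The data set name `wald_check` and its variable names are hypothetical, and because the inputs are rounded to four decimals the results match the table only approximately.

```sas
data wald_check;
   length Effect $9;
   input Effect $ Estimate StdErr;
   /* Wald statistic with 1 degree of freedom */
   ChiSq     = (Estimate / StdErr)**2;
   /* upper-tail p-value */
   ProbChiSq = 1 - probchi(ChiSq, 1);
   datalines;
Intercept  5.5592 1.1197
Heat      -0.0820 0.0237
Soak      -0.0568 0.3312
;

proc print data=wald_check noobs;
run;
```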

Output 22.3.6: Covariance Matrix

 Maximum Likelihood Logistic Regression

 The CATMOD Procedure

Covariance Matrix of the Maximum Likelihood Estimates

|   | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 1.2537133 | -0.0215664 | -0.2817648 |
| 2 | -0.0215664 | 0.0005633 | 0.0026243 |
| 3 | -0.2817648 | 0.0026243 | 0.1097020 |

Output 22.3.7: Correlation Matrix

 Maximum Likelihood Logistic Regression

 The CATMOD Procedure

Correlation Matrix of the Maximum Likelihood Estimates

|   | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 1.00000 | -0.81152 | -0.75977 |
| 2 | -0.81152 | 1.00000 | 0.33383 |
| 3 | -0.75977 | 0.33383 | 1.00000 |
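The covariance matrix, the correlation matrix, and the standard errors in Output 22.3.5 are tied together: each standard error is the square root of a diagonal covariance entry, and each correlation is a covariance divided by the two corresponding standard errors. A minimal sketch of that check, with the covariance entries typed in by hand from Output 22.3.6 (the variable names are illustrative only):

```sas
data _null_;
   /* lower triangle of the covariance matrix, copied from Output 22.3.6 */
   v11 =  1.2537133;
   v21 = -0.0215664;  v22 = 0.0005633;
   v31 = -0.2817648;  v32 = 0.0026243;  v33 = 0.1097020;

   /* standard errors: square roots of the diagonal entries */
   se1 = sqrt(v11);  se2 = sqrt(v22);  se3 = sqrt(v33);

   /* correlations: covariance scaled by both standard errors */
   r21 = v21 / (se2 * se1);
   r31 = v31 / (se3 * se1);
   r32 = v32 / (se3 * se2);

   put se1= 8.4 se2= 8.4 se3= 8.4;
   put r21= 8.5 r31= 8.5 r32= 8.5;
run;
```

The values written to the log should agree, up to rounding, with the Standard Error column of Output 22.3.5 and the off-diagonal entries of Output 22.3.7.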

From the table of maximum likelihood estimates (Output 22.3.5), the fitted model is

$$
E(\operatorname{logit}(p)) = 5.559 - 0.082\,(\text{Heat}) - 0.057\,(\text{Soak})
$$

For example, for Sample 1 with Heat = 7 and Soak = 1, the estimate, computed from the four-decimal values in Output 22.3.5, is

$$
E(\operatorname{logit}(p)) = 5.5592 - 0.0820(7) - 0.0568(1) = 4.9284
$$
Predicted values of the logits, as well as the probabilities of readiness, could be obtained by specifying PRED=PROB in the MODEL statement. For the example of Sample 1 with Heat = 7 and Soak = 1, PRED=PROB would give an estimate of the probability of readiness equal to 0.9928 since

$$
4.9284 = \log\left(\frac{p}{1-p}\right)
$$

implies that

$$
p = \frac{e^{4.9284}}{1 + e^{4.9284}} = 0.9928
$$
As another consideration, since soaking time is nonsignificant, you could fit another model that deleted the variable Soak.
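Both follow-ups mentioned above, requesting predicted probabilities and dropping Soak, reuse the statements from this example. The following is a minimal sketch, assuming the `ingots` data set and its `Count` weight variable are already in place. The choice of options on the reduced model simply mirrors the original call and is not prescribed by the text.

```sas
/* Full model, with predicted probabilities requested */
proc catmod data=ingots;
   weight Count;
   direct Heat Soak;
   model Y=Heat Soak / pred=prob;
quit;

/* Reduced model: Soak removed from the DIRECT and MODEL statements */
proc catmod data=ingots;
   weight Count;
   direct Heat;
   model Y=Heat / freq covb corrb;
quit;
```

Note that with Soak removed from the MODEL statement, the populations are defined by Heat alone, so the population count and the likelihood ratio degrees of freedom will differ from those in Output 22.3.1 and Output 22.3.4.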
