Example 22.3: Logistic Regression, Standard Response Function
In this data set, from Cox and Snell (1989), ingots are
prepared with different heating and soaking times and tested
for their readiness to be rolled. The response variable
Y has value 1 for ingots that are not ready and value 0
otherwise. The explanatory variables are Heat and
Soak.
   title 'Maximum Likelihood Logistic Regression';
   data ingots;
      input Heat Soak nready ntotal @@;
      /* first record: ingots that are not ready (Y=1) */
      Count=nready;
      Y=1;
      output;
      /* second record: the remaining ingots (Y=0) */
      Count=ntotal-nready;
      Y=0;
      output;
      drop nready ntotal;
      datalines;
   7 1.0 0 10   14 1.0 0 31   27 1.0 1 56   51 1.0 3 13
   7 1.7 0 17   14 1.7 0 43   27 1.7 4 44   51 1.7 0  1
   7 2.2 0  7   14 2.2 2 33   27 2.2 0 21   51 2.2 0  1
   7 2.8 0 12   14 2.8 0 31   27 2.8 1 22   51 4.0 0  1
   7 4.0 0  9   14 4.0 0 19   27 4.0 1 16
   ;
Logistic regression analysis is often used to investigate
the relationship between discrete response variables and
continuous explanatory variables. For logistic regression,
the continuous design effects are declared in a
DIRECT statement. The following statements produce
Output 22.3.1 through Output 22.3.7.
   proc catmod data=ingots;
      weight Count;
      direct Heat Soak;
      model Y=Heat Soak / freq covb corrb;
   quit;
Output 22.3.1: Maximum Likelihood Logistic Regression

Maximum Likelihood Logistic Regression

Response            Y         Response Levels      2
Weight Variable     Count     Populations         19
Data Set            INGOTS    Total Frequency    387
Frequency Missing   0         Observations        25

Population Profiles

Sample   Heat   Soak   Sample Size
     1      7      1            10
     2      7    1.7            17
     3      7    2.2             7
     4      7    2.8            12
     5      7      4             9
     6     14      1            31
     7     14    1.7            43
     8     14    2.2            33
     9     14    2.8            31
    10     14      4            19
    11     27      1            56
    12     27    1.7            44
    13     27    2.2            21
    14     27    2.8            22
    15     27      4            16
    16     51      1            13
    17     51    1.7             1
    18     51    2.2             1
    19     51      4             1

You can verify that the populations are defined as you
intended by looking at the "Population Profiles"
table in Output 22.3.1.
Output 22.3.2: Response Summaries

Maximum Likelihood Logistic Regression

Response Profiles

Response   Y
       1   0
       2   1

Response Frequencies

         Response Number
Sample        1        2
     1       10        0
     2       17        0
     3        7        0
     4       12        0
     5        9        0
     6       31        0
     7       43        0
     8       31        2
     9       31        0
    10       19        0
    11       55        1
    12       40        4
    13       21        0
    14       21        1
    15       15        1
    16       10        3
    17        1        0
    18        1        0
    19        1        0

Since the "Response Profiles" table shows the
response level ordering as 0, 1, the default response
function, the logit, is defined as $\log(p_{Y=0} / p_{Y=1})$.
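In symbols, the model fit in this example can be sketched as follows. The coefficient names $\beta_0$, $\beta_1$, and $\beta_2$ are notation assumed here for illustration; they correspond to parameters 1, 2, and 3 in the output tables.

```latex
\[
\operatorname{logit}(p)
  = \log\!\left(\frac{p_{Y=0}}{p_{Y=1}}\right)
  = \beta_0 + \beta_1\,\mathrm{Heat} + \beta_2\,\mathrm{Soak}
\]
```

Because Y=0 marks the ingots that are ready, larger values of this logit correspond to a higher probability of readiness.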
Output 22.3.3: Iteration History

Maximum Likelihood Logistic Regression

Maximum Likelihood Analysis

                 Sub   -2 Log       Convergence       Parameter Estimates
Iteration  Iteration   Likelihood   Criterion          1         2          3
        0          0   536.49592    1.0000             0         0          0
        1          0   152.58961    0.7156        2.1594   -0.0139  -0.003733
        2          0   106.76066    0.3003        3.5334   -0.0363    -0.0120
        3          0   96.692171    0.0943        4.7489   -0.0640    -0.0299
        4          0   95.383825    0.0135        5.4138   -0.0790    -0.0498
        5          0   95.345659    0.000400      5.5539   -0.0819    -0.0564
        6          0   95.345613    4.8289E-7     5.5592   -0.0820    -0.0568
        7          0   95.345613    7.73E-13      5.5592   -0.0820    -0.0568

Maximum likelihood computations converged.

Seven Newton-Raphson iterations are required to find the
maximum likelihood estimates.
Output 22.3.4: Analysis of Variance Table

Maximum Likelihood Logistic Regression

Maximum Likelihood Analysis of Variance

Source             DF   Chi-Square   Pr > ChiSq
Intercept           1        24.65       <.0001
Heat                1        11.95       0.0005
Soak                1         0.03       0.8639
Likelihood Ratio   16        13.75       0.6171

The analysis of variance table (Output 22.3.4) shows that
the model fits adequately, since the likelihood ratio
goodness-of-fit test is nonsignificant (p = 0.6171). It also
shows that the length of heating time is a significant factor
with respect to readiness (p = 0.0005) but that the length of
soaking time is not (p = 0.8639).
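As a rough check on the goodness-of-fit conclusion, the p-value of the likelihood ratio test can be recomputed from the chi-square statistic and its degrees of freedom (16, which is 19 populations minus 3 parameters). The following DATA step is a minimal sketch, assuming the values are typed in by hand from Output 22.3.4; the variable names are illustrative only.

```sas
/* Sketch: recompute the likelihood ratio goodness-of-fit p-value */
data _null_;
   chisq = 13.75;                    /* Likelihood Ratio chi-square from Output 22.3.4 */
   df    = 16;                       /* degrees of freedom from the same row           */
   p     = 1 - probchi(chisq, df);   /* upper-tail probability of the chi-square       */
   put 'Likelihood ratio p-value: ' p 6.4;
run;
```

The value written to the log should be close to the Pr > ChiSq entry of 0.6171 in the table.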
Output 22.3.5: Maximum Likelihood Estimates

Maximum Likelihood Logistic Regression

Analysis of Maximum Likelihood Estimates

Effect      Parameter   Estimate   Standard Error   Chi-Square   Pr > ChiSq
Intercept           1     5.5592           1.1197        24.65       <.0001
Heat                2    -0.0820           0.0237        11.95       0.0005
Soak                3    -0.0568           0.3312         0.03       0.8639

Output 22.3.6: Covariance Matrix

Maximum Likelihood Logistic Regression

Covariance Matrix of the Maximum Likelihood Estimates

              1            2            3
1     1.2537133   -0.0215664   -0.2817648
2    -0.0215664    0.0005633    0.0026243
3    -0.2817648    0.0026243    0.1097020

Output 22.3.7: Correlation Matrix

Maximum Likelihood Logistic Regression

Correlation Matrix of the Maximum Likelihood Estimates

            1          2          3
1     1.00000   -0.81152   -0.75977
2    -0.81152    1.00000    0.33383
3    -0.75977    0.33383    1.00000

From the table of maximum likelihood estimates
(Output 22.3.5), the fitted model is

$$E(\operatorname{logit}(p)) = 5.559 - 0.082(\mathrm{Heat}) - 0.057(\mathrm{Soak})$$

For example, for Sample 1 with Heat=7 and
Soak=1, the estimate is

$$E(\operatorname{logit}(p)) = 5.559 - 0.082(7) - 0.057(1) = 4.9284$$
Predicted values of the logits, as well as the probabilities
of readiness, could be obtained by specifying PRED=PROB in
the MODEL statement. For the example of Sample 1 with
Heat=7 and Soak=1, PRED=PROB would give an
estimate of the probability of readiness equal to 0.9928
since

$$4.9284 = \log\left(\frac{\hat{p}}{1-\hat{p}}\right)$$

implies that

$$\hat{p} = \frac{e^{4.9284}}{1+e^{4.9284}} = 0.9928$$
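The PRED=PROB request described above can be sketched as follows. The PROC CATMOD step repeats the earlier statements with PRED=PROB added; NOITER is included only to suppress the iteration history and is optional. The DATA step is a hand check of the Sample 1 arithmetic, assuming the rounded estimates 5.5592, -0.0820, and -0.0568 from the table of maximum likelihood estimates; its variable names are illustrative.

```sas
/* Sketch: request predicted values along with the fit */
proc catmod data=ingots;
   weight Count;
   direct Heat Soak;
   model Y=Heat Soak / pred=prob noiter;   /* PRED=PROB adds predicted values */
quit;

/* Hand check for Sample 1 (Heat=7, Soak=1) */
data _null_;
   logit = 5.5592 - 0.0820*7 - 0.0568*1;   /* predicted logit, log(p(Y=0)/p(Y=1)) */
   p     = exp(logit) / (1 + exp(logit));   /* inverse logit: probability of Y=0   */
   put logit= 8.4 p= 8.4;
run;
```

The hand check should give a logit near 4.9284 and a probability near 0.9928, matching the values quoted in the text.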
As another consideration, since soaking time is
nonsignificant, you could fit another model that omits the
variable Soak.
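A reduced model of this kind might be specified as in the following sketch, which simply removes Soak from the DIRECT and MODEL statements. It is an illustration under the same data set and weight variable, not output reproduced from this example.

```sas
/* Sketch: reduced model with Heat as the only explanatory variable */
proc catmod data=ingots;
   weight Count;
   direct Heat;
   model Y=Heat / freq;
quit;
```

Note that PROC CATMOD then defines the populations from Heat alone, so the "Population Profiles" table would show four populations instead of 19. If you want to keep the original 19 populations, adding a statement such as `population Heat Soak;` is one option to consider.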
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.