The LOGISTIC Procedure

Example 35.8: Conditional Logistic Regression for Matched Pairs Data

In matched case-control studies, conditional logistic regression is used to investigate the relationship between an outcome of being a case or a control and a set of prognostic factors. When each matched set consists of a single case and a single control, the conditional likelihood is given by

\prod_i \left(1+\exp(-{\beta}'(x_{i1}-x_{i0}))\right)^{-1}
where x_{i1} and x_{i0} are vectors representing the prognostic factors for the case and control, respectively, of the ith matched set. This likelihood is identical to the likelihood of fitting a logistic regression model to a set of data with constant response, where the model contains no intercept term and has explanatory variables given by d_i = x_{i1} - x_{i0} (Breslow 1982).
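
The form of this likelihood can be motivated by a short derivation. The sketch below assumes a logistic model with a separate intercept for each matched set, written here as \alpha_i (a symbol introduced only for this illustration). Conditioning on the fact that exactly one member of the pair is a case makes \alpha_i cancel:

```latex
\Pr(\text{subject 1 is the case} \mid \text{exactly one case in pair } i)
  = \frac{\exp(\alpha_i + \beta' x_{i1})}
         {\exp(\alpha_i + \beta' x_{i1}) + \exp(\alpha_i + \beta' x_{i0})}
  = \frac{1}{1 + \exp(-\beta'(x_{i1} - x_{i0}))}
```

The right-hand side is the probability that a logistic model with no intercept and covariate vector x_{i1} - x_{i0} assigns to a single fixed response level, and the product of these terms over all pairs is the conditional likelihood.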

The data in this example are a subset of the data from the Los Angeles Study of Endometrial Cancer in Breslow and Day (1980). There are 63 matched pairs, each consisting of a case of endometrial cancer (Outcome=1) and a control (Outcome=0). The case and corresponding control have the same ID. Two prognostic factors are included: Gall (an indicator variable for gall bladder disease) and Hyper (an indicator variable for hypertension). The goal of the case-control analysis is to determine the relative risk for gall bladder disease, controlling for the effect of hypertension.

Before PROC LOGISTIC is used for the logistic regression analysis, each matched pair is transformed into a single observation, where the variables Gall and Hyper contain the differences between the corresponding values for the case and the control (case - control). The variable Outcome, which will be used as the response variable in the logistic regression model, is given a constant value of 0 (which is the Outcome value for the control, although any constant, numeric or character, will do).

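   data;
      /* id1, gall1, and hyper1 hold the first record of each pair (the case) */
      drop id1 gall1 hyper1;
      retain id1 gall1 hyper1 0;
      input ID Outcome Gall Hyper @@ ;
      if (ID = id1) then do;
         /* second record of the pair (the control): form case - control */
         Gall=gall1-Gall; Hyper=hyper1-Hyper;
         output;
      end;
      else do;
         /* first record of the pair (the case): save its values */
         id1=ID; gall1=Gall; hyper1=Hyper;
      end;
      datalines;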
    1  1 0  0  1  0  0  0   2  1 0  0  2  0  0  0
    3  1 0  1  3  0  0  1   4  1 0  0  4  0  1  0
    5  1 1  0  5  0  0  1   6  1 0  1  6  0  0  0
    7  1 1  0  7  0  0  0   8  1 1  1  8  0  0  1
    9  1 0  0  9  0  0  0  10  1 0  0 10  0  0  0
   11  1 1  0 11  0  0  0  12  1 0  0 12  0  0  1
   13  1 1  0 13  0  0  1  14  1 1  0 14  0  1  0
   15  1 1  0 15  0  0  1  16  1 0  1 16  0  0  0
   17  1 0  0 17  0  1  1  18  1 0  0 18  0  1  1
   19  1 0  0 19  0  0  1  20  1 0  1 20  0  0  0
   21  1 0  0 21  0  1  1  22  1 0  1 22  0  0  1
   23  1 0  1 23  0  0  0  24  1 0  0 24  0  0  0
   25  1 0  0 25  0  0  0  26  1 0  0 26  0  0  1
   27  1 1  0 27  0  0  1  28  1 0  0 28  0  0  1
   29  1 1  0 29  0  0  0  30  1 0  1 30  0  0  0
   31  1 0  1 31  0  0  0  32  1 0  1 32  0  0  0
   33  1 0  1 33  0  0  0  34  1 0  0 34  0  0  0
   35  1 1  1 35  0  1  1  36  1 0  0 36  0  0  1
   37  1 0  1 37  0  0  0  38  1 0  1 38  0  0  1
   39  1 0  1 39  0  0  1  40  1 0  1 40  0  0  0
   41  1 0  0 41  0  0  0  42  1 0  1 42  0  1  0
   43  1 0  0 43  0  0  1  44  1 0  0 44  0  0  0
   45  1 1  0 45  0  0  0  46  1 0  0 46  0  0  0
   47  1 1  1 47  0  0  0  48  1 0  1 48  0  0  0
   49  1 0  0 49  0  0  0  50  1 0  1 50  0  0  1
   51  1 0  0 51  0  0  0  52  1 0  1 52  0  0  1
   53  1 0  1 53  0  0  0  54  1 0  1 54  0  0  0
   55  1 1  0 55  0  0  0  56  1 0  0 56  0  0  0
   57  1 1  1 57  0  1  0  58  1 0  0 58  0  0  0
   59  1 0  0 59  0  0  0  60  1 1  1 60  0  0  0
   61  1 1  0 61  0  1  0  62  1 0  1 62  0  0  0
   63  1 1  0 63  0  0  0
   ;

Note that there are 63 observations in the data set, one for each matched pair. The variable Outcome has a constant value of 0.
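
As a quick check on the transformed data, the differences can be tabulated. The following step is a sketch that is not part of the original example. It assumes that the DATA step above is the most recent step to have created a data set, so that PROC FREQ with no DATA= option reads it.

```sas
/* Tabulate the within-pair differences (case - control).          */
/* With no DATA= option, PROC FREQ reads the most recently created */
/* data set; it creates no data set of its own here.               */
proc freq;
   tables Gall Hyper;
run;
```

Each difference can only take the value -1, 0, or 1. Counting the data lines above gives 13 pairs with Gall = 1, 5 pairs with Gall = -1, and 45 pairs with Gall = 0.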

In the following SAS statements, PROC LOGISTIC is invoked with the NOINT option to obtain the conditional logistic model estimates. Two models are fitted. The first model contains Gall as the only predictor variable, and the second model contains both Gall and Hyper as predictor variables. Because the option CLODDS=PL is specified, PROC LOGISTIC computes a 95% profile likelihood confidence interval for the odds ratio for each predictor variable.

   proc logistic;
      model outcome=Gall / noint CLODDS=PL;
   run;

   proc logistic;
      model outcome=Gall Hyper / noint CLODDS=PL;
   run;

Results from the two conditional logistic analyses are shown in Output 35.8.1 and Output 35.8.2. Note that there is only one response level listed in the "Response Profile" tables and there is no intercept term in the "Analysis of Maximum Likelihood Estimates" tables.

Output 35.8.1: Conditional Logistic Regression (Gall as risk factor)

The LOGISTIC Procedure

Model Information
Data Set                    WORK.DATA1
Response Variable           Outcome
Number of Response Levels   1
Number of Observations      63
Link Function               Logit
Optimization Technique      Fisher's scoring

Response Profile
Ordered Value   Outcome   Total Frequency
            1   0                      63

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion   Without Covariates   With Covariates
AIC                     87.337            85.654
SC                      87.337            87.797
-2 Log L                87.337            83.654

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio       3.6830    1       0.0550
Score                  3.5556    1       0.0593
Wald                   3.2970    1       0.0694

Analysis of Maximum Likelihood Estimates
                Parameter   Standard         Wald                Standardized    Odds
Variable   DF    Estimate      Error   Chi-Square   Pr > ChiSq       Estimate   Ratio
Gall        1      0.9555     0.5262       3.2970       0.0694         0.2757   2.600

NOTE: Since there is only one response level, measures of association between
the observed and predicted values were not calculated.

Adjusted Odds Ratios and 95% Confidence Intervals
                                 Profile Likelihood
                                 Confidence Limits
Variable     Unit   Odds Ratio   Lower   Upper
Gall       1.0000        2.600   0.981   8.103


Output 35.8.2: Conditional Logistic Regression (Gall and Hyper as risk factors)

The LOGISTIC Procedure

Model Information
Data Set                    WORK.DATA1
Response Variable           Outcome
Number of Response Levels   1
Number of Observations      63
Link Function               Logit
Optimization Technique      Fisher's scoring

Response Profile
Ordered Value   Outcome   Total Frequency
            1   0                      63

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics
Criterion   Without Covariates   With Covariates
AIC                     87.337            86.788
SC                      87.337            91.074
-2 Log L                87.337            82.788

Testing Global Null Hypothesis: BETA=0
Test               Chi-Square   DF   Pr > ChiSq
Likelihood Ratio       4.5487    2       0.1029
Score                  4.3620    2       0.1129
Wald                   4.0060    2       0.1349

Analysis of Maximum Likelihood Estimates
                Parameter   Standard         Wald                Standardized    Odds
Variable   DF    Estimate      Error   Chi-Square   Pr > ChiSq       Estimate   Ratio
Gall        1      0.9704     0.5307       3.3432       0.0675         0.2800   2.639
Hyper       1      0.3481     0.3770       0.8526       0.3558         0.1348   1.416

NOTE: Since there is only one response level, measures of association between
the observed and predicted values were not calculated.

Adjusted Odds Ratios and 95% Confidence Intervals
                                 Profile Likelihood
                                 Confidence Limits
Variable     Unit   Odds Ratio   Lower   Upper
Gall       1.0000        2.639   0.981   8.299
Hyper      1.0000        1.416   0.682   3.039


In the first model, where Gall is the only predictor variable (Output 35.8.1), the odds ratio estimate for Gall is 2.60, which is an estimate of the relative risk for gall bladder disease. A 95% confidence interval for this relative risk is (0.981, 8.103).
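
For a single binary factor, the estimates in Output 35.8.1 can be reproduced by hand, which may help in reading the output. The worked calculation below is a sketch: the symbols n_{+} and n_{-} are introduced here for the number of pairs with Gall difference 1 and -1, respectively, and the counts 13 and 5 come from tallying the data lines above. Pairs with a difference of 0 contribute only a constant factor to the likelihood.

```latex
L(\beta) \propto \left(\frac{1}{1+e^{-\beta}}\right)^{n_{+}}
                 \left(\frac{1}{1+e^{\beta}}\right)^{n_{-}},
\qquad
e^{\hat\beta} = \frac{n_{+}}{n_{-}} = \frac{13}{5} = 2.6,
\qquad
\hat\beta = \log 2.6 \approx 0.9555

\mathrm{SE}(\hat\beta) = \sqrt{\frac{1}{n_{+}} + \frac{1}{n_{-}}}
                       = \sqrt{\frac{1}{13} + \frac{1}{5}} \approx 0.5262,
\qquad
\text{Score} = \frac{(n_{+} - n_{-})^2}{n_{+} + n_{-}} = \frac{64}{18} \approx 3.5556
```

These values agree with the parameter estimate, standard error, odds ratio, and score chi-square reported in Output 35.8.1.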

In the second model, where both Gall and Hyper are present (Output 35.8.2), the odds ratio estimate for Gall is 2.639, which is an estimate of the relative risk for gall bladder disease adjusted for the effects of hypertension. A 95% confidence interval for this adjusted relative risk is (0.987, 8.299). Note that the adjusted values for gall bladder disease are not very different from the unadjusted values. This is not surprising since the prognostic factor Hyper is not statistically significant. The 95% profile likelihood confidence interval for the odds ratio for Hyper is (0.682, 3.039), which contains unity.
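
The contribution of Hyper can also be gauged by comparing the two fitted models directly. The following DATA step is a minimal sketch, not part of the original example, of a likelihood ratio test with 1 degree of freedom; it copies the two "-2 Log L" values by hand from Output 35.8.1 and Output 35.8.2, and the variable names LR and P are chosen here for illustration.

```sas
/* Likelihood ratio test for adding Hyper to the model with Gall.  */
/* The -2 Log L values are typed in from the two outputs above.    */
data _null_;
   LR = 83.654 - 82.788;      /* Gall only minus Gall and Hyper    */
   P  = 1 - probchi(LR, 1);   /* upper tail of chi-square, 1 DF    */
   put LR= P=;                /* write the results to the SAS log  */
run;
```

The difference is 0.866, and the resulting p-value is roughly 0.35, in line with the Wald test for Hyper (Pr > ChiSq = 0.3558).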


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.