 The PHREG Procedure

## Example 49.1: Stepwise Regression

Krall, Uthoff, and Harley (1975) analyzed data from a study on multiple myeloma in which researchers treated 65 patients with alkylating agents. Of those patients, 48 died during the study and 17 survived. In the data set Myeloma, the variable Time represents the survival time in months from diagnosis. The variable VStatus consists of two values, 0 and 1, indicating whether the patient was alive or dead, respectively, at the end of the study. If the value of VStatus is 0, the corresponding value of Time is censored. The variables thought to be related to survival are LogBUN (log(BUN) at diagnosis), HGB (hemoglobin at diagnosis), Platelet (platelets at diagnosis: 0=abnormal, 1=normal), Age (age at diagnosis in years), LogWBC (log(WBC) at diagnosis), Frac (fractures at diagnosis: 0=none, 1=present), LogPBM (log percentage of plasma cells in bone marrow), Protein (proteinuria at diagnosis), and SCalc (serum calcium at diagnosis). Interest lies in identifying important prognostic factors from these nine explanatory variables.

```
data Myeloma;
   input Time VStatus LogBUN HGB Platelet Age LogWBC Frac
         LogPBM Protein SCalc;
   /* The labels appear in the Model Information table of the output */
   label Time='Survival Time'
         VStatus='0=Alive 1=Dead';
   datalines;
1.25  1  2.2175   9.4  1  67  3.6628  1  1.9542  12  10
1.25  1  1.9395  12.0  1  38  3.9868  1  1.9542  20  18
2.00  1  1.5185   9.8  1  81  3.8751  1  2.0000   2  15
2.00  1  1.7482  11.3  0  75  3.8062  1  1.2553   0  12
2.00  1  1.3010   5.1  0  57  3.7243  1  2.0000   3   9
3.00  1  1.5441   6.7  1  46  4.4757  0  1.9345  12  10
5.00  1  2.2355  10.1  1  50  4.9542  1  1.6628   4   9
5.00  1  1.6812   6.5  1  74  3.7324  0  1.7324   5   9
6.00  1  1.3617   9.0  1  77  3.5441  0  1.4624   1   8
6.00  1  2.1139  10.2  0  70  3.5441  1  1.3617   1   8
6.00  1  1.1139   9.7  1  60  3.5185  1  1.3979   0  10
6.00  1  1.4150  10.4  1  67  3.9294  1  1.6902   0   8
7.00  1  1.9777   9.5  1  48  3.3617  1  1.5682   5  10
7.00  1  1.0414   5.1  0  61  3.7324  1  2.0000   1  10
7.00  1  1.1761  11.4  1  53  3.7243  1  1.5185   1  13
9.00  1  1.7243   8.2  1  55  3.7993  1  1.7404   0  12
11.00  1  1.1139  14.0  1  61  3.8808  1  1.2788   0  10
11.00  1  1.2304  12.0  1  43  3.7709  1  1.1761   1   9
11.00  1  1.3010  13.2  1  65  3.7993  1  1.8195   1  10
11.00  1  1.5682   7.5  1  70  3.8865  0  1.6721   0  12
11.00  1  1.0792   9.6  1  51  3.5051  1  1.9031   0   9
13.00  1  0.7782   5.5  0  60  3.5798  1  1.3979   2  10
14.00  1  1.3979  14.6  1  66  3.7243  1  1.2553   2  10
15.00  1  1.6021  10.6  1  70  3.6902  1  1.4314   0  11
16.00  1  1.3424   9.0  1  48  3.9345  1  2.0000   0  10
16.00  1  1.3222   8.8  1  62  3.6990  1  0.6990  17  10
17.00  1  1.2304  10.0  1  53  3.8808  1  1.4472   4   9
17.00  1  1.5911  11.2  1  68  3.4314  0  1.6128   1  10
18.00  1  1.4472   7.5  1  65  3.5682  0  0.9031   7   8
19.00  1  1.0792  14.4  1  51  3.9191  1  2.0000   6  15
19.00  1  1.2553   7.5  0  60  3.7924  1  1.9294   5   9
24.00  1  1.3010  14.6  1  56  4.0899  1  0.4771   0   9
25.00  1  1.0000  12.4  1  67  3.8195  1  1.6435   0  10
26.00  1  1.2304  11.2  1  49  3.6021  1  2.0000  27  11
32.00  1  1.3222  10.6  1  46  3.6990  1  1.6335   1   9
35.00  1  1.1139   7.0  0  48  3.6532  1  1.1761   4  10
37.00  1  1.6021  11.0  1  63  3.9542  0  1.2041   7   9
41.00  1  1.0000  10.2  1  69  3.4771  1  1.4771   6  10
41.00  1  1.1461   5.0  1  70  3.5185  1  1.3424   0   9
51.00  1  1.5682   7.7  0  74  3.4150  1  1.0414   4  13
52.00  1  1.0000  10.1  1  60  3.8573  1  1.6532   4  10
54.00  1  1.2553   9.0  1  49  3.7243  1  1.6990   2  10
58.00  1  1.2041  12.1  1  42  3.6990  1  1.5798  22  10
66.00  1  1.4472   6.6  1  59  3.7853  1  1.8195   0   9
67.00  1  1.3222  12.8  1  52  3.6435  1  1.0414   1  10
88.00  1  1.1761  10.6  1  47  3.5563  0  1.7559  21   9
89.00  1  1.3222  14.0  1  63  3.6532  1  1.6232   1   9
92.00  1  1.4314  11.0  1  58  4.0755  1  1.4150   4  11
4.00  0  1.9542  10.2  1  59  4.0453  0  0.7782  12  10
4.00  0  1.9243  10.0  1  49  3.9590  0  1.6232   0  13
7.00  0  1.1139  12.4  1  48  3.7993  1  1.8573   0  10
7.00  0  1.5315  10.2  1  81  3.5911  0  1.8808   0  11
8.00  0  1.0792   9.9  1  57  3.8325  1  1.6532   0   8
12.00  0  1.1461  11.6  1  46  3.6435  0  1.1461   0   7
11.00  0  1.6128  14.0  1  60  3.7324  1  1.8451   3   9
12.00  0  1.3979   8.8  1  66  3.8388  1  1.3617   0   9
13.00  0  1.6628   4.9  0  71  3.6435  0  1.7924   0   9
16.00  0  1.1461  13.0  1  55  3.8573  0  0.9031   0   9
19.00  0  1.3222  13.0  1  59  3.7709  1  2.0000   1  10
19.00  0  1.3222  10.8  1  69  3.8808  1  1.5185   0  10
28.00  0  1.2304   7.3  1  82  3.7482  1  1.6721   0   9
41.00  0  1.7559  12.8  1  72  3.7243  1  1.4472   1   9
53.00  0  1.1139  12.0  1  66  3.6128  1  2.0000   1  11
57.00  0  1.2553  12.5  1  66  3.9685  0  1.9542   0  11
77.00  0  1.0792  14.0  1  60  3.6812  0  0.9542   0  12
;
```
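As a quick check that the data were read as described, the censoring indicator can be tabulated. This is a minimal sketch that assumes the Myeloma data set was created by the DATA step above. The frequency table should show 48 observations with VStatus=1 and 17 with VStatus=0.

```sas
/* Sketch: count deaths (VStatus=1) and censored observations (VStatus=0). */
/* Assumes the Myeloma data set from the DATA step above exists in WORK.   */
proc freq data=Myeloma;
   tables VStatus;
run;
```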

The stepwise selection process consists of a series of alternating step-up and step-down phases. The former adds variables to the model, while the latter removes variables from the model.

Stepwise regression analysis is requested by specifying the SELECTION=STEPWISE option in the MODEL statement. The option SLENTRY=0.25 specifies that a variable has to be significant at the 0.25 level before it can be entered into the model, while the option SLSTAY=0.15 specifies that a variable in the model has to be significant at the 0.15 level for it to remain in the model. The DETAILS option requests detailed results for the variable selection process.

```
proc phreg data=Myeloma;
   model Time*VStatus(0)=LogBUN HGB Platelet Age LogWBC
                         Frac LogPBM Protein SCalc
         / selection=stepwise slentry=0.25
           slstay=0.15 details;
run;
```

Results of the stepwise regression analysis are displayed in Output 49.1.1 through Output 49.1.7.

Output 49.1.1: Individual Score Test Results for all Variables

 The PHREG Procedure

Model Information

|  |  |  |
| --- | --- | --- |
| Data Set | WORK.MYELOMA |  |
| Dependent Variable | Time | Survival Time |
| Censoring Variable | VStatus | 0=Alive 1=Dead |
| Censoring Value(s) | 0 |  |
| Ties Handling | BRESLOW |  |

Summary of the Number of Event and Censored Values

| Total | Event | Censored | Percent Censored |
| --- | --- | --- | --- |
| 65 | 48 | 17 | 26.15 |

Analysis of Variables Not in the Model

| Variable | Score Chi-Square | Pr > ChiSq |
| --- | --- | --- |
| LogBUN | 8.5164 | 0.0035 |
| HGB | 5.0664 | 0.0244 |
| Platelet | 3.1816 | 0.0745 |
| Age | 0.0183 | 0.8924 |
| LogWBC | 0.5658 | 0.4519 |
| Frac | 0.9151 | 0.3388 |
| LogPBM | 0.5846 | 0.4445 |
| Protein | 0.1466 | 0.7018 |
| SCalc | 1.1109 | 0.2919 |

Residual Chi-Square Test

| Chi-Square | DF | Pr > ChiSq |
| --- | --- | --- |
| 18.4550 | 9 | 0.0302 |

Output 49.1.2: First Model in the Stepwise Selection Process

 Step 1. Variable LogBUN is entered. The model contains the following explanatory variables:

 LogBUN

Convergence Status: Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

| Criterion | Without Covariates | With Covariates |
| --- | --- | --- |
| -2 LOG L | 309.716 | 301.959 |
| AIC | 309.716 | 303.959 |
| SBC | 309.716 | 305.830 |

Testing Global Null Hypothesis: BETA=0

| Test | Chi-Square | DF | Pr > ChiSq |
| --- | --- | --- | --- |
| Likelihood Ratio | 7.7572 | 1 | 0.0053 |
| Score | 8.5164 | 1 | 0.0035 |
| Wald | 8.3392 | 1 | 0.0039 |

Analysis of Maximum Likelihood Estimates

| Variable | DF | Parameter Estimate | Standard Error | Chi-Square | Pr > ChiSq | Hazard Ratio |
| --- | --- | --- | --- | --- | --- | --- |
| LogBUN | 1 | 1.74595 | 0.60460 | 8.3392 | 0.0039 | 5.731 |

Individual score tests are used to determine which of the nine explanatory variables is first selected into the model. In this case, the score test for each variable is the global score test for the model containing that variable as the only explanatory variable. The chi-squared statistic is compared to a chi-squared distribution with one degree of freedom. Output 49.1.1 displays the chi-squared statistics and the corresponding p-values. The variable LogBUN has the largest chi-squared value (8.5164), and it is significant (p=0.0035) at the SLENTRY=0.25 level. The variable LogBUN is thus entered into the model. Output 49.1.2 displays the model results. Since the Wald chi-squared statistic is significant (p=0.0039) at the SLSTAY=0.15 level, LogBUN stays in the model.
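The entry decision described above can be retraced by hand. The following sketch is not part of the original example: it copies the score chi-square values from Output 49.1.1 into a hypothetical data set named ScoreTests, computes each p-value as the upper tail of a chi-squared distribution with one degree of freedom, and flags the variables that meet the SLENTRY=0.25 level. The variable names PValue and MeetsEntry are illustrative, and treating the criterion as a strict inequality is an assumption.

```sas
/* Sketch: recompute the one-degree-of-freedom p-values of Output 49.1.1. */
/* ScoreTests, PValue, and MeetsEntry are illustrative names.             */
data ScoreTests;
   input Variable :$8. ChiSq;
   /* Upper-tail probability of a chi-squared variate with 1 DF */
   PValue = 1 - probchi(ChiSq, 1);
   /* 1 if the p-value is below SLENTRY=0.25 (strict inequality assumed) */
   MeetsEntry = (PValue < 0.25);
   format PValue 6.4;
   datalines;
LogBUN   8.5164
HGB      5.0664
Platelet 3.1816
Age      0.0183
LogWBC   0.5658
Frac     0.9151
LogPBM   0.5846
Protein  0.1466
SCalc    1.1109
;

proc print data=ScoreTests noobs;
run;
```

The computed p-values should agree with the Pr > ChiSq column of Output 49.1.1 up to rounding of the printed chi-square values. Among the variables that meet the entry level, the procedure enters the one with the largest chi-square, which is LogBUN.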

Output 49.1.3: Score Tests Adjusted for the Variable LogBUN

Analysis of Variables Not in the Model

| Variable | Score Chi-Square | Pr > ChiSq |
| --- | --- | --- |
| HGB | 4.3468 | 0.0371 |
| Platelet | 2.0183 | 0.1554 |
| Age | 0.7159 | 0.3975 |
| LogWBC | 0.0704 | 0.7908 |
| Frac | 1.0354 | 0.3089 |
| LogPBM | 1.0334 | 0.3094 |
| Protein | 0.5214 | 0.4703 |
| SCalc | 1.4150 | 0.2342 |

Residual Chi-Square Test

| Chi-Square | DF | Pr > ChiSq |
| --- | --- | --- |
| 9.3164 | 8 | 0.3163 |

Output 49.1.4: Second Model in the Stepwise Selection Process

 Step 2. Variable HGB is entered. The model contains the following explanatory variables:

 LogBUN HGB

Convergence Status: Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

| Criterion | Without Covariates | With Covariates |
| --- | --- | --- |
| -2 LOG L | 309.716 | 297.767 |
| AIC | 309.716 | 301.767 |
| SBC | 309.716 | 305.509 |

Testing Global Null Hypothesis: BETA=0

| Test | Chi-Square | DF | Pr > ChiSq |
| --- | --- | --- | --- |
| Likelihood Ratio | 11.9493 | 2 | 0.0025 |
| Score | 12.7252 | 2 | 0.0017 |
| Wald | 12.1900 | 2 | 0.0023 |

Analysis of Maximum Likelihood Estimates

| Variable | DF | Parameter Estimate | Standard Error | Chi-Square | Pr > ChiSq | Hazard Ratio |
| --- | --- | --- | --- | --- | --- | --- |
| LogBUN | 1 | 1.67440 | 0.61209 | 7.4833 | 0.0062 | 5.336 |
| HGB | 1 | -0.11899 | 0.05751 | 4.2811 | 0.0385 | 0.888 |

The next step consists of selecting another variable to add to the model. Output 49.1.3 displays the chi-squared statistics and p-values of individual score tests (adjusted for LogBUN) for the remaining eight variables. The score chi-square for a given variable is the value of the likelihood score test for testing the significance of the variable in the presence of LogBUN. The variable HGB is selected because it has the highest chi-squared value (4.3468), and it is significant (p=0.0371) at the SLENTRY=0.25 level. Output 49.1.4 displays the fitted model containing both LogBUN and HGB. Based on the Wald statistics, neither LogBUN nor HGB is removed from the model.
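The Wald statistics and hazard ratios in Output 49.1.4 are related to the parameter estimates in a simple way, which the following sketch retraces. It assumes the usual definitions, a Wald chi-square equal to the squared ratio of the estimate to its standard error and a hazard ratio equal to the exponentiated estimate. The variable names are illustrative and the inputs are the rounded values printed in the table.

```sas
/* Sketch: retrace the Wald chi-squares and hazard ratios of Output 49.1.4. */
/* Inputs are the printed (rounded) estimates and standard errors.          */
data _null_;
   /* Wald chi-square = (estimate / standard error)**2 */
   wald_logbun = (1.67440 / 0.61209)**2;
   wald_hgb    = (-0.11899 / 0.05751)**2;
   /* Hazard ratio = exp(estimate) */
   hr_logbun   = exp(1.67440);
   hr_hgb      = exp(-0.11899);
   put wald_logbun= 8.4 wald_hgb= 8.4;
   put hr_logbun= 6.3 hr_hgb= 6.3;
run;
```

The values written to the log should match the Chi-Square and Hazard Ratio columns up to rounding. A hazard ratio below 1 for HGB indicates that, in this fitted model, higher hemoglobin is associated with a lower hazard of death.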

Output 49.1.5: Third Model in the Stepwise Regression

 Step 3. Variable SCalc is entered. The model contains the following explanatory variables:

 LogBUN HGB SCalc

Convergence Status: Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

| Criterion | Without Covariates | With Covariates |
| --- | --- | --- |
| -2 LOG L | 309.716 | 296.078 |
| AIC | 309.716 | 302.078 |
| SBC | 309.716 | 307.692 |

Testing Global Null Hypothesis: BETA=0

| Test | Chi-Square | DF | Pr > ChiSq |
| --- | --- | --- | --- |
| Likelihood Ratio | 13.6377 | 3 | 0.0034 |
| Score | 15.3053 | 3 | 0.0016 |
| Wald | 14.4542 | 3 | 0.0023 |

Analysis of Maximum Likelihood Estimates

| Variable | DF | Parameter Estimate | Standard Error | Chi-Square | Pr > ChiSq | Hazard Ratio |
| --- | --- | --- | --- | --- | --- | --- |
| LogBUN | 1 | 1.63593 | 0.62359 | 6.8822 | 0.0087 | 5.134 |
| HGB | 1 | -0.12643 | 0.05868 | 4.6419 | 0.0312 | 0.881 |
| SCalc | 1 | 0.13286 | 0.09868 | 1.8127 | 0.1782 | 1.142 |

Output 49.1.5 shows Step 3 of the selection process, in which the variable SCalc is added, resulting in the model with LogBUN, HGB, and SCalc as the explanatory variables. SCalc enters because its score test adjusted for LogBUN and HGB (chi-square 1.8225, p=0.1770; see Output 49.1.7) is significant at the SLENTRY=0.25 level. Once in the model, however, SCalc has the smallest Wald chi-squared statistic, and it is not significant (p=0.1782) at the SLSTAY=0.15 level. SCalc is therefore removed from the model in a step-down phase in Step 4 (Output 49.1.6). Because the next candidate for entry would again be SCalc, its removal brings the stepwise selection process to a stop, which avoids repeatedly entering and removing the same variable.
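The reason SCalc enters and then immediately leaves is that the two significance levels differ. The sketch below, which is illustrative and not part of the original example, applies both thresholds to the SCalc chi-square values reported in Output 49.1.7. The variable names and the use of strict inequalities are assumptions.

```sas
/* Sketch: apply SLENTRY and SLSTAY to the SCalc statistics of Output 49.1.7. */
data _null_;
   slentry = 0.25;
   slstay  = 0.15;
   p_score = 1 - probchi(1.8225, 1);   /* score test used for entry (Step 3) */
   p_wald  = 1 - probchi(1.8127, 1);   /* Wald test used for staying         */
   enters  = (p_score < slentry);      /* 1 if SCalc qualifies for entry     */
   stays   = (p_wald  < slstay);       /* 1 if SCalc qualifies to remain     */
   put p_score= 6.4 enters=;
   put p_wald=  6.4 stays=;
run;
```

Both p-values lie between 0.15 and 0.25, so SCalc passes the more lenient entry test but fails the stricter stay test.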

The procedure also displays a summary table of the steps in the stepwise selection process, as shown in Output 49.1.7. The stepwise selection process results in a model with two explanatory variables, LogBUN and HGB.
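The selected model can also be fit directly, without any selection options. This is a minimal sketch that assumes the Myeloma data set from the beginning of the example and the default handling of ties. Under those assumptions it should reproduce the estimates shown for the final model in Output 49.1.6.

```sas
/* Sketch: fit the two-variable model chosen by the stepwise process. */
/* Assumes the Myeloma data set created earlier in this example.      */
proc phreg data=Myeloma;
   model Time*VStatus(0)=LogBUN HGB;
run;
```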

Output 49.1.6: Final Model in the Stepwise Regression

 Step 4. Variable SCalc is removed. The model contains the following explanatory variables:

 LogBUN HGB

Convergence Status: Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

| Criterion | Without Covariates | With Covariates |
| --- | --- | --- |
| -2 LOG L | 309.716 | 297.767 |
| AIC | 309.716 | 301.767 |
| SBC | 309.716 | 305.509 |

Testing Global Null Hypothesis: BETA=0

| Test | Chi-Square | DF | Pr > ChiSq |
| --- | --- | --- | --- |
| Likelihood Ratio | 11.9493 | 2 | 0.0025 |
| Score | 12.7252 | 2 | 0.0017 |
| Wald | 12.1900 | 2 | 0.0023 |

Analysis of Maximum Likelihood Estimates

| Variable | DF | Parameter Estimate | Standard Error | Chi-Square | Pr > ChiSq | Hazard Ratio |
| --- | --- | --- | --- | --- | --- | --- |
| LogBUN | 1 | 1.67440 | 0.61209 | 7.4833 | 0.0062 | 5.336 |
| HGB | 1 | -0.11899 | 0.05751 | 4.2811 | 0.0385 | 0.888 |

 NOTE: Model building terminates because the variable to be entered is the variable that was removed in the last step.

Output 49.1.7: Model Selection Summary

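Summary of Stepwise Selection

| Step | Variable Entered | Variable Removed | Number In | Score Chi-Square | Wald Chi-Square | Pr > ChiSq |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | LogBUN |  | 1 | 8.5164 | . | 0.0035 |
| 2 | HGB |  | 2 | 4.3468 | . | 0.0371 |
| 3 | SCalc |  | 3 | 1.8225 | . | 0.1770 |
| 4 |  | SCalc | 2 | . | 1.8127 | 0.1782 |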
