Chapter Contents 
Previous 
Next 
The LOGISTIC Procedure 
Often survival times are not observed more precisely than the interval (for instance, a day) within which the event occurred. Survival data of this form are known as grouped or intervalcensored data. A discrete analogue of the continuous proportional hazards model (Prentice and Gloeckler 1978; Allison 1982) is used to investigate the relationship between these survival times and a set of explanatory variables.
Suppose T_{i} is the discrete survival time variable of the ith subject with covariates x_{i}. The discretetime hazard rate is defined as
Using elementary properties of conditional probabilities, it can be shown that
Suppose t_{i} is the observed survival time of the ith subject. Suppose if T_{i}=t_{i} is an event time and 0 otherwise. The likelihood for the grouped survival data is given by
Note that the likelihood L for the grouped survival data is the same as the likelihood of a binary response model with event probabilities .If the data are generated by a continuoustime proportional hazards model, Prentice and Gloeckler (1978) have shown that
Consider a study of the effect of insecticide on flourbeetles. Four different concentrations of an insecticide were sprayed on separate groups of flourbeetles. The numbers of male and female flourbeetles dying in successive intervals were saved in the data set beetles.
data beetles(keep=time sex conc freq); input time m20 f20 m32 f32 m50 f50 m80 f80; conc=.20; freq= m20; sex=1; output; freq= f20; sex=2; output; conc=.32; freq= m32; sex=1; output; freq= f32; sex=2; output; conc=.50; freq= m50; sex=1; output; freq= f50; sex=2; output; conc=.80; freq= m80; sex=1; output; freq= f80; sex=2; output; datalines; 1 3 0 7 1 5 0 4 2 2 11 2 10 5 8 4 10 7 3 10 4 11 11 11 6 8 15 4 7 8 16 10 15 6 14 9 5 4 9 3 5 4 3 8 3 6 3 3 2 1 2 1 2 4 7 2 0 1 0 1 1 1 1 8 1 0 0 1 1 4 0 1 9 0 0 1 1 0 0 0 0 10 0 0 0 0 0 0 1 1 11 0 0 0 0 1 1 0 0 12 1 0 0 0 0 1 0 0 13 1 0 0 0 0 1 0 0 14 101 126 19 47 7 17 2 4 ;
The data set beetles contains four variables: time, sex, conc, and freq. time represents the interval death time; for example, time=2 is the interval between day 1 and day 2. Insects surviving the duration (13 days) of the experiment are given a time value of 14. The variable sex represents the sex of the insects (1=male, 2=female), conc represents the concentration of the insecticide (mg/cm^{2}), and freq represents the frequency of the observations.
To use PROC LOGISTIC with the grouped survival data, you must expand the data so that each beetle has a separate record for each day of survival. A beetle that died in the third day (time=3) would contribute three observations to the analysis, one for each day it was alive at the beginning of the day. A beetle that survives the 13day duration of the experiment (time=14) would contribute 13 observations.
A new data set days that contains the beetleday observations is created from the data set beetles. In addition to the variables sex, conc and freq, the data set contains an outcome variable y and 13 indicator variables day1, day2, ..., day13. y has a value of 1 if the observation corresponds to the day that the beetle died and has a value of 0 otherwise. An observation for the first day will have a value of 1 for day1 and a value of 0 for day2 day13; an observation for the second day will have a value of 1 for day2 and a value of 0 for day1 and day2 day13. For instance, Output 39.11.1 shows an observation in the beetles data set with time=3, and Output 39.11.2 shows the corresponding beetleday observations in the data set days.
data days; retain day1day13 0; array dd[13] day1day13; set beetles; if time = 14 then do day=1 to 13; y=0; dd[day]=1; output; dd[day]=0; end; else do day=1 to time; if day=time then y=1; else y=0; dd[day]=1; output; dd[day]=0; end;Output 39.11.1: An Observation with Time=3 in Data Set Beetles

proc logistic data=days descending outest=est1; model y= day1day13 sex conc / noint link=cloglog technique=newton; freq freq; run;Output 39.11.3: Parameter Estimates for the Grouped Proportional Hazards Model

The coefficients of day1 day13 are the maximum likelihood estimates of , respectively. The baseline survivor function S_{0}(t) is estimated by
The following DATA step computes the survivor curves for male and female flourbeetles exposed to the insecticide of concentrations .20 mg/cm^{2} and .80 mg/cm^{2}. The GPLOT procedure in SAS/GRAPH software is used to plot the survival curves. Instead of plotting them as step functions, the SPLINE option is used to smooth the curves. These smoothed survival curves are displayed in Output 39.11.4.
data one (keep=day survival element s_m20 s_f20 s_m80 s_f80); array dd day1day13; array sc[4] m20 f20 m80 f80; array s_sc[4] s_m20 s_f20 s_m80 s_f80 (1 1 1 1); set est1; m20= exp(sex + .20 * conc); f20= exp(2 * sex + .20 * conc); m80= exp(sex + .80 * conc); f80= exp(2 * sex + .80 * conc); survival=1; day=0; output; do over dd; element= exp(exp(dd)); survival= survival * element; do i=1 to 4; s_sc[i] = survival ** sc[i]; end; day + 1; output; end; label s_m20= 'Male at .20 conc.' s_m80= 'Male at .80 conc.' s_f20= 'Female at .20 conc.' s_f80= 'Female at .80 conc.'; run; title1 'FlourBeetles Sprayed with Insecticide'; legend1 label=none frame cframe=ligr cborder=black position=center value=(justify=center); axis1 label=(angle=90 'Survival Function'); proc gplot data=one; plot (s_m20 s_f20 s_m80 s_f80) * day / overlay legend=legend1 vaxis=axis1 cframe=ligr; symbol1 v=dot i=spline c=black height=.8; symbol2 v=dot i=spline c=red height=.8; symbol3 v=dot i=spline c=blue height=.8; symbol4 v=dot i=spline c=yellow height=.8; run;
The probability of survival is displayed on the vertical axis. Notice that most of the insecticide effect occurs by day 6 for both the high and low concentrations.
Output 39.11.4: Predicted Survival at Concentrations of 0.20 and 0.80 mg/cm^{2}
Chapter Contents 
Previous 
Next 
Top 
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.