The LIFEREG Procedure
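The LIFEREG procedure fits parametric models to failure time data that can be right, left, or interval censored. The models for the response variable consist of a linear effect composed of the covariates and a random disturbance term. The distribution of the random disturbance can be taken from a class of distributions that includes the extreme value, normal, logistic, and, by using a log transformation, the exponential, Weibull, lognormal, loglogistic, and gamma distributions. The model assumed for the response y is

$$ \mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \sigma\boldsymbol{\epsilon} $$

where $\mathbf{y}$ is the vector of response values (often the log of the failure times), $\mathbf{X}$ is the matrix of covariates, $\boldsymbol{\beta}$ is the vector of unknown regression parameters, $\sigma$ is an unknown scale parameter, and $\boldsymbol{\epsilon}$ is a vector of errors from a known distribution.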

The LIFEREG procedure estimates the parameters by maximum
likelihood using a Newton-Raphson algorithm. PROC LIFEREG
estimates the standard errors of the parameter estimates
from the inverse of the observed information matrix.
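
As a rough illustration of the estimation approach just described (not PROC LIFEREG's actual implementation), the following Python sketch fits a one-parameter exponential model to right-censored data by Newton-Raphson and takes the standard error from the inverse of the observed information. The data values, the function name `fit_exponential`, the starting value, and the parameterization theta = log(rate) are all assumptions made for this example, and the loop has no step-halving safeguard.

```python
import math

def fit_exponential(times, events, tol=1e-10, max_iter=50):
    """Newton-Raphson MLE of theta = log(rate) for right-censored exponential data.

    times  : observed times (failure or censoring times)
    events : 1 if the time is an observed failure, 0 if it is right censored
    Returns (theta_hat, standard_error_of_theta_hat).
    """
    d = sum(events)        # number of observed failures
    total = sum(times)     # total observed time
    theta = 0.0            # starting value (arbitrary choice for this sketch)
    for _ in range(max_iter):
        # log likelihood: d * theta - exp(theta) * total
        score = d - math.exp(theta) * total    # first derivative
        info = math.exp(theta) * total         # observed information (minus second derivative)
        step = score / info
        theta += step
        if abs(step) < tol:
            break
    info = math.exp(theta) * total
    return theta, math.sqrt(1.0 / info)        # SE from the inverse observed information

times = [2.0, 3.5, 1.2, 6.0, 4.1, 5.0]   # made-up data
events = [1, 1, 1, 0, 1, 0]              # 0 marks a right-censored time
theta_hat, se = fit_exponential(times, events)
rate_hat = math.exp(theta_hat)
```

For this simple model the maximum likelihood estimate has the closed form (number of failures) / (total observed time), so the iteration can be checked against it directly.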
The accelerated failure time model assumes that the
effect of independent variables on an event time
distribution is multiplicative on the event time.
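Usually, the scale function is $\exp(\mathbf{x}'\boldsymbol{\beta})$, where
$\mathbf{x}$ is the vector of covariate values and
$\boldsymbol{\beta}$ is a vector of unknown parameters.
Thus, if $T_0$ is an event time sampled from the baseline
distribution corresponding to values of zero for the covariates,
then the accelerated failure time model specifies that, if
the vector of covariates is $\mathbf{x}$, the event time
is $T = \exp(\mathbf{x}'\boldsymbol{\beta})\,T_0$.
If $y = \log(T)$ and $y_0 = \log(T_0)$, then

$$ y = \mathbf{x}'\boldsymbol{\beta} + y_0 $$

This is a linear model with $y_0$ as the error term.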

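In terms of survival or exceedance probabilities, this model is

$$ \Pr(T > t \mid \mathbf{x}) = \Pr(T_0 > \exp(-\mathbf{x}'\boldsymbol{\beta})\,t) $$

where the left-hand side is evaluated given the covariate vector $\mathbf{x}$ and the right-hand side is computed from the baseline distribution.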
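
The relation Pr(T > t | x) = Pr(T0 > exp(-x'beta) t) implied by the accelerated failure time model can be checked numerically. The sketch below assumes a unit-rate exponential baseline distribution and uses hypothetical values for the coefficients, covariates, and time point; none of these come from the procedure itself. It compares the model probability with the empirical exceedance frequency of simulated times T = exp(x'beta) T0.

```python
import math
import random

def baseline_survival(t):
    """Survival function of the assumed baseline: unit-rate exponential."""
    return math.exp(-t)

def aft_survival(t, x, beta):
    """Pr(T > t | x) under the AFT model, computed as S0(exp(-x'beta) * t)."""
    linear = sum(xi * bi for xi, bi in zip(x, beta))
    return baseline_survival(math.exp(-linear) * t)

beta = [0.5, -0.3]   # hypothetical regression parameters
x = [1.0, 2.0]       # hypothetical covariate vector
t = 1.5              # time point at which to compare

# Simulate T = exp(x'beta) * T0 with T0 drawn from the baseline distribution.
rng = random.Random(12345)
scale = math.exp(sum(xi * bi for xi, bi in zip(x, beta)))
n = 200_000
exceed = sum(1 for _ in range(n) if scale * rng.expovariate(1.0) > t)

empirical = exceed / n             # simulated exceedance frequency
model = aft_survival(t, x, beta)   # probability from the AFT relation
```

The two values should agree up to simulation error, which shrinks as `n` grows.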

Usually, an intercept parameter and a scale parameter are allowed in the model. In terms of the original untransformed event times, the effects of the intercept term and the scale term are to scale the event time and power the event time, respectively. That is, if

$$ \log(T_0) = \mu + \sigma \log(T) $$

then

$$ T_0 = \exp(\mu)\,T^{\sigma} $$


The parameter estimates for the normal distribution are sensitive to large negative values, and care must be taken that the fitted model is not unduly influenced by them. Likewise, values that are extremely large even after the log transformation have a strong influence in fitting the extreme value (Weibull) and normal distributions. You should examine the residuals and check the effects of removing observations with large residuals or extreme values of covariates on the model parameters. The logistic distribution gives robust parameter estimates in the sense that the estimates have a bounded influence function.
The standard errors of the parameter estimates are
computed from large sample normal approximations
using the observed information matrix.
In small samples, these approximations may be poor.
Refer to Lawless (1982) for additional discussion and references.
You can sometimes construct better confidence intervals
by transforming the parameters.
For example, large sample
theory is often more accurate for $\log(\sigma)$
than for $\sigma$.
Therefore, it may be more accurate to construct
confidence intervals for $\log(\sigma)$ and transform
these into confidence intervals for $\sigma$.
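
One way to carry out such a transformation is sketched below in Python. It assumes a delta-method standard error for the log of the scale estimate (standard error divided by the estimate), which is a common choice but is not stated in the text above; the function name and the numeric inputs are hypothetical.

```python
import math

def scale_confidence_intervals(sigma_hat, se_sigma, z=1.959964):
    """Return (direct_interval, log_based_interval) for a scale parameter.

    direct    : sigma_hat +/- z * se, which can extend below zero
    log_based : interval built for log(sigma) and then exponentiated
    """
    direct = (sigma_hat - z * se_sigma, sigma_hat + z * se_sigma)
    se_log = se_sigma / sigma_hat   # delta-method SE of log(sigma_hat) (assumption)
    log_based = (sigma_hat * math.exp(-z * se_log),
                 sigma_hat * math.exp(z * se_log))
    return direct, log_based

# Hypothetical estimate with a large standard error, as might occur in a small sample.
direct, log_based = scale_confidence_intervals(sigma_hat=0.8, se_sigma=0.45)
```

With these inputs the direct interval has a negative lower limit, which is impossible for a scale parameter, whereas the back-transformed interval stays positive by construction.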
The parameter estimates and their
estimated covariance matrix are available in an output
SAS data set and can be used to construct additional
tests or confidence intervals for the parameters.
Alternatively, tests of parameters
can be based on log-likelihood ratios.
Refer to Cox and Oakes (1984) for a discussion of the
merits of some possible test methods including
score, Wald, and likelihood ratio tests.
Likelihood ratio tests are generally considered
more reliable in small samples
than tests based on the information matrix.
The log-likelihood function is computed using
the log of the failure time as a response.
This log likelihood differs from the log likelihood
obtained using the failure time as the response
by an additive term of $\sum \log(t_i)$, where
the sum is over the noncensored failure times.
This term does not depend on the unknown parameters and
does not affect parameter or standard error estimates.
However, many published values of log likelihoods use the failure
time as the basic response variable and, hence, differ by the
additive term from the value computed by the LIFEREG procedure.
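
The size of this additive term can be verified with a small Python sketch. It assumes an exponential model with right censoring only; the data and the rate value are made up for illustration. The density of y = log(t) is the density of t multiplied by exp(y), so each noncensored observation contributes an extra log(t) on the log scale, while censored observations contribute the same survival probability on either scale.

```python
import math

def loglik_time_scale(times, events, rate):
    """Exponential log likelihood with the failure time t as the response."""
    total = 0.0
    for t, e in zip(times, events):
        if e:    # observed failure: log density of T
            total += math.log(rate) - rate * t
        else:    # right censored: log survival probability
            total += -rate * t
    return total

def loglik_log_scale(times, events, rate):
    """Same model with y = log(t) as the response."""
    total = 0.0
    for t, e in zip(times, events):
        y = math.log(t)
        if e:    # density of Y = log(T) is f_T(exp(y)) * exp(y)
            total += math.log(rate) - rate * math.exp(y) + y
        else:    # survival probabilities are unchanged by the transformation
            total += -rate * math.exp(y)
    return total

times = [2.0, 3.5, 1.2, 6.0, 4.1, 5.0]   # made-up data
events = [1, 1, 1, 0, 1, 0]              # 0 marks a right-censored time
rate = 0.2                               # arbitrary parameter value

difference = loglik_log_scale(times, events, rate) - loglik_time_scale(times, events, rate)
additive_term = sum(math.log(t) for t, e in zip(times, events) if e)
```

Because `additive_term` involves only the data, the difference is the same for every value of `rate`, which is why the parameter estimates and standard errors are unaffected.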
The classic Tobit model (Tobin 1958) also fits into this class of models but with data usually censored on the left. The data considered by Tobin in his original paper came from a survey of consumers where the response variable is the ratio of expenditures on durable goods to the total disposable income. The two explanatory variables are the age of the head of household and the ratio of liquid assets to total disposable income. Because many observations in this data set have a value of zero for the response variable, the model fit by Tobin is

$$ y = \max(\mathbf{x}'\boldsymbol{\beta} + \epsilon,\ 0) $$

This is a regression model with left censoring at 0.
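
A minimal Python sketch of the log likelihood for a model of this kind follows. It assumes the usual Tobit form y = max(x'beta + e, 0) with normally distributed errors e of standard deviation sigma, so that a response of zero is treated as left censored at 0. The function names, data, and parameter values are hypothetical and are not Tobin's.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik(y, X, beta, sigma):
    """Log likelihood for y = max(x'beta + e, 0) with e ~ Normal(0, sigma^2)."""
    total = 0.0
    for yi, xi in zip(y, X):
        mean = sum(a * b for a, b in zip(xi, beta))
        if yi > 0:
            # Uncensored response: normal log density at yi
            z = (yi - mean) / sigma
            total += -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z
        else:
            # Left censored at 0: log probability that x'beta + e <= 0
            total += math.log(normal_cdf(-mean / sigma))
    return total

# Made-up data: an intercept column plus one covariate; the first response is censored.
X = [[1.0, 0.2], [1.0, 1.5], [1.0, 0.7]]
y = [0.0, 0.9, 0.3]
ll = tobit_loglik(y, X, beta=[0.1, 0.5], sigma=0.4)
```

Maximizing this function over `beta` and `sigma` would give the Tobit estimates; the sketch only evaluates the likelihood and does not perform the optimization.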

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.