SAS Macros and Functions

Details

Theoretical Background

When a time series has a unit root, the series is nonstationary and the ordinary least squares (OLS) estimator is not normally distributed. Dickey (1976) and Dickey and Fuller (1979) studied the limiting distribution of the OLS estimator of autoregressive models for time series with a simple unit root. Dickey, Hasza and Fuller (1984) obtained the limiting distribution for time series with seasonal unit roots.

Consider the (p+1)th order autoregressive time series

Y_{t} = {\alpha}_{1}Y_{t-1} + {\alpha}_{2}Y_{t-2}
 + { ... } + {\alpha}_{p+1}Y_{t-p-1} + e_{t}

and its characteristic equation

m^{p+1} -{\alpha}_{1}m^p - {\alpha}_{2}m^{p-1}
 - { ... } - {\alpha}_{p+1} = 0

If all the characteristic roots are less than 1 in absolute value, {Y_{t}} is stationary. {Y_{t}} is nonstationary if there is a unit root. If there is a unit root, the sum of the autoregressive parameters is 1, and, hence, you can test for a unit root by testing whether the sum of the autoregressive parameters is 1 or not. For convenience, the model is parameterized as

{\nabla}Y_{t} = {\delta}Y_{t-1} + {\theta}_{1}{\nabla}Y_{t-1}
+ { ... } + {\theta}_{p}{\nabla}Y_{t-p} + e_{t}

where {{\nabla}Y_{t}=Y_{t}-Y_{t-1}} and

{\delta} = {\alpha}_{1}+{ ... }+{\alpha}_{p+1}-1
{\theta}_{k} = -{\alpha}_{k+1}-{ ... }-{\alpha}_{p+1}
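The mapping from the autoregressive parameters to {\delta} and {\theta}_{k} can be checked numerically. The following Python sketch is illustrative only and is not part of SAS; the function names to_dickey_fuller_form and has_unit_root are hypothetical. It applies the two formulas above and uses the fact that a unit root corresponds to {\delta} = 0.

```python
def to_dickey_fuller_form(alphas):
    """Map (alpha_1, ..., alpha_{p+1}) to (delta, [theta_1, ..., theta_p]).

    delta   = alpha_1 + ... + alpha_{p+1} - 1
    theta_k = -(alpha_{k+1} + ... + alpha_{p+1}),  k = 1, ..., p
    """
    order = len(alphas)  # this is p + 1
    delta = sum(alphas) - 1.0
    # alphas is 0-based, so alpha_{k+1} is alphas[k]
    thetas = [-sum(alphas[k:]) for k in range(1, order)]
    return delta, thetas


def has_unit_root(alphas, tol=1e-12):
    """A root at m = 1 means the AR parameters sum to 1, that is, delta = 0."""
    delta, _ = to_dickey_fuller_form(alphas)
    return abs(delta) < tol


# AR(2) whose characteristic equation m^2 - 1.5m + 0.5 = 0 has roots 1 and 0.5
print(to_dickey_fuller_form([1.5, -0.5]), has_unit_root([1.5, -0.5]))
# AR(3) whose parameters sum to 0.9, so delta is about -0.1
print(to_dickey_fuller_form([0.5, 0.3, 0.1]), has_unit_root([0.5, 0.3, 0.1]))
```
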

The estimators are obtained by regressing {{\nabla}Y_{t}} on {Y_{t-1},{\nabla}Y_{t-1},{ ... },{\nabla}Y_{t-p}}. The t statistic of the ordinary least squares estimator of {\delta} is the test statistic for the unit root test.
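As a rough illustration of this regression, the Python sketch below computes the t statistic of the estimate of {\delta} for the model without mean or trend terms. It is not the SAS implementation: the helper names ols and adf_t_statistic are hypothetical, the least squares fit uses plain normal equations, and the two simulated series exist only for demonstration.

```python
import random


def ols(X, y):
    """Least squares via the normal equations; returns (coefficients, std errors)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    se = [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]
    return beta, se


def adf_t_statistic(series, p):
    """t statistic of delta in dY_t = delta*Y_{t-1} + sum_i theta_i*dY_{t-i} + e_t."""
    # diffs[j] holds Y_{j+1} - Y_j, so the difference at time t is diffs[t - 1]
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    X, y = [], []
    for t in range(p + 1, len(series)):
        X.append([series[t - 1]] + [diffs[t - 1 - i] for i in range(1, p + 1)])
        y.append(diffs[t - 1])
    beta, se = ols(X, y)
    return beta[0] / se[0]


# Demonstration data (assumed, not from the documentation)
rng = random.Random(0)
walk, stationary = [0.0], [0.0]
for _ in range(200):
    walk.append(walk[-1] + rng.gauss(0, 1))                     # unit root
    stationary.append(0.5 * stationary[-1] + rng.gauss(0, 1))   # stationary AR(1)

t_walk = adf_t_statistic(walk, p=1)
t_stationary = adf_t_statistic(stationary, p=1)
print(t_walk, t_stationary)
```

The statistic for the stationary series should be far more negative than the one for the random walk. The statistic must be compared with the Dickey-Fuller distribution, not with the usual t distribution.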

If the TREND=1 option is used, the autoregressive model includes a mean term {{\alpha}_{0}}. If TREND=2, the model also includes a time trend term and the model is as follows:

{\nabla}Y_{t} = {\alpha}_{0} + {\gamma}t +
{\delta}Y_{t-1} + {\theta}_{1}{\nabla}Y_{t-1}
+ { ... } + {\theta}_{p}{\nabla}Y_{t-p} + e_{t}
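The effect of the trend setting on the regression can be sketched as a choice of regressor columns. The Python function below is a hypothetical helper, not SAS code. It assumes that a trend value of 0 means neither a mean nor a time trend term, which the text implies but does not state, and it uses the observation index as the time variable t.

```python
def adf_design(series, p, trend=0):
    """Build regressor rows and responses for the unit root regression.

    trend=0: no deterministic terms (assumed meaning of the default)
    trend=1: adds a constant column for alpha_0
    trend=2: adds a constant column and a time column for gamma*t
    """
    # diffs[j] holds Y_{j+1} - Y_j, so the difference at time t is diffs[t - 1]
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    X, y = [], []
    for t in range(p + 1, len(series)):
        row = []
        if trend >= 1:
            row.append(1.0)          # mean term alpha_0
        if trend == 2:
            row.append(float(t))     # time trend term gamma*t
        row.append(series[t - 1])    # Y_{t-1}, whose coefficient is delta
        row.extend(diffs[t - 1 - i] for i in range(1, p + 1))
        X.append(row)
        y.append(diffs[t - 1])
    return X, y


X, y = adf_design([1.0, 3.0, 6.0, 10.0, 15.0], p=1, trend=2)
for row, response in zip(X, y):
    print(row, response)
```
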

For testing for a seasonal unit root, consider the multiplicative model

(1-{\alpha}_{d}B^d)(1-{\theta}_{1}B-{ ... }
-{\theta}_{p}B^p)Y_{t}=e_{t}

Let {{\nabla}^dY_{t} {\equiv} Y_{t}-Y_{t-d}}. The test statistic is calculated in the following steps:

  1. Regress {{\nabla}^dY_{t}} on {{\nabla}^dY_{t-1}, { ... }, {\nabla}^dY_{t-p}} to obtain the initial estimators {\hat{{\theta}}_{i}} and compute residuals {\hat{e}_{t}}. Under the null hypothesis that {{\alpha}_{d}=1}, {\hat{{\theta}}_{i}} are consistent estimators of {{\theta}_{i}}.
  2. Regress {\hat{e}_{t}} on {(1-\hat{{\theta}}_{1}B-{ ... }-\hat{{\theta}}_{p}B^p)Y_{t-d}, {\nabla}^dY_{t-1}, { ... }, {\nabla}^dY_{t-p}} to obtain estimates of {{\delta}={\alpha}_{d}-1} and {{\theta}_{i}-\hat{{\theta}}_{i}}.

The t ratio for the estimate of {\delta} produced by the second step is used as a test statistic for testing for a seasonal unit root. The estimates of {{\theta}_{i}} are obtained by adding the estimates of {{\theta}_{i}-\hat{{\theta}}_{i}} from the second step to {\hat{{\theta}}_{i}} from the first step. The estimates of {{\alpha}_{d}-1} and {{\theta}_{i}} are saved in the OUTSTAT= data set if the OUTSTAT= option is specified.
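The two-step calculation can be sketched in Python as follows. This is an illustrative reading of the steps above, not the SAS implementation: the names ols and seasonal_df_statistic are hypothetical, no mean or trend terms are included, and the simulated series are assumptions made only for the demonstration.

```python
import random


def ols(X, y):
    """Least squares via the normal equations; returns (coefficients, std errors)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    aug = [xtx[i] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for col in range(k):  # Gauss-Jordan inversion of X'X
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    return beta, [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]


def seasonal_df_statistic(series, d, p):
    """Return (t ratio of delta, delta estimate, theta estimates)."""
    def sdiff(t):                       # seasonal difference Y_t - Y_{t-d}
        return series[t] - series[t - d]

    times = range(p + d, len(series))
    lagged = [[sdiff(t - i) for i in range(1, p + 1)] for t in times]
    y = [sdiff(t) for t in times]

    # Step 1: initial theta estimates and residuals
    theta_hat, _ = ols(lagged, y)
    resid = [yt - sum(th * x for th, x in zip(theta_hat, row))
             for yt, row in zip(y, lagged)]

    # Step 2: regress residuals on the filtered Y_{t-d} and the lagged differences
    def filtered(t):                    # (1 - theta_1 B - ... - theta_p B^p) Y_t
        return series[t] - sum(th * series[t - i]
                               for i, th in enumerate(theta_hat, start=1))

    X2 = [[filtered(t - d)] + row for t, row in zip(times, lagged)]
    coef, se = ols(X2, resid)
    theta = [th + c for th, c in zip(theta_hat, coef[1:])]
    return coef[0] / se[0], coef[0], theta


# Demonstration data (assumed): a seasonal random walk and a stationary series, d = 4
rng = random.Random(1)
d = 4
unit = [rng.gauss(0, 1) for _ in range(d)]
stat = [rng.gauss(0, 1) for _ in range(d)]
for _ in range(400):
    unit.append(unit[-d] + rng.gauss(0, 1))
    stat.append(0.3 * stat[-d] + rng.gauss(0, 1))

t_unit, delta_unit, theta_unit = seasonal_df_statistic(unit, d, p=1)
t_stat, delta_stat, theta_stat = seasonal_df_statistic(stat, d, p=1)
print(t_unit, t_stat)
```
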

The series {(1-B^d)Y_{t}} is assumed to be stationary, where d is the value of the DLAG= option.

If the OUTSTAT= option is specified, the OUTSTAT= data set contains estimates {\hat{{\delta}}, \hat{{\theta}}_{1},{ ... }, \hat{{\theta}}_{p}}.

If the series is an ARMA process, a large value of the AR= option may be desirable in order to obtain a reliable test statistic. To determine an appropriate value for the AR= option for an ARMA process, refer to Said and Dickey (1984).

Test Statistics

The Dickey-Fuller test is used to test the null hypothesis that the time series exhibits a lag d unit root against the alternative of stationarity. The PROBDF function computes the probability of observing a test statistic more extreme than x under the assumption that the null hypothesis is true. You should reject the unit root hypothesis when PROBDF returns a small (significant) probability value.

There are several different versions of the Dickey-Fuller test. The PROBDF function supports six versions, as selected by the type argument. Specify the type value that corresponds to the way that you calculated the test statistic x.

The last two characters of the type value specify the kind of regression model used to compute the Dickey-Fuller test statistic. The meanings of the last two characters of the type value are as follows.

ZM
zero mean or no intercept case. The test statistic x is assumed to be computed from the regression model
y_{t} = {\alpha}_{d}y_{t-d}+e_{t}

SM
single mean or intercept case. The test statistic x is assumed to be computed from the regression model
y_{t} = {\alpha}_{0}+{\alpha}_{d}y_{t-d}+e_{t}

TR
intercept and deterministic time trend case. The test statistic x is assumed to be computed from the regression model
y_{t} = {\alpha}_{0}+{\gamma} t+{\alpha}_{1}y_{t-1}+e_{t}

The first character of the type value specifies whether the regression test statistic or the studentized test statistic is used. Let {\hat{{\alpha}}_{d}} be the estimated regression coefficient for the dth lag of the series, and let {\rm{se}_{\hat{{\alpha}}}} be the standard error of {\hat{{\alpha}}_{d}}. The meaning of the first character of the type value is as follows.

R
the regression coefficient-based test statistic. The test statistic is
x = n(\hat{{\alpha}}_{d}-1)

S
the studentized test statistic. The test statistic is
x = \frac{(\hat{{\alpha}}_{d}-1)}{\rm{se}_{\hat{{\alpha}}}}

Refer to Dickey and Fuller (1979) and Dickey, Hasza, and Fuller (1984) for more information about the Dickey-Fuller test null distribution. The preceding formulas are for the basic Dickey-Fuller test. The PROBDF function can also be used for the augmented Dickey-Fuller test, in which the error term {e_{t}} is modeled as an autoregressive process; however, the test statistic is computed somewhat differently for the augmented Dickey-Fuller test. Refer to Dickey, Hasza, and Fuller (1984) and Hamilton (1994) for information about seasonal and nonseasonal augmented Dickey-Fuller tests.
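For the basic test in the zero mean case, both statistics can be computed directly from the formulas above. The Python sketch below is illustrative and is not SAS code; the function name basic_df_statistics is hypothetical. It assumes that n is the number of observations used in the regression, and it labels the results with the type values formed by joining the first character (R or S) to ZM.

```python
def basic_df_statistics(series, d=1):
    """Zero mean basic Dickey-Fuller statistics for y_t = alpha_d * y_{t-d} + e_t.

    Returns a dict keyed by the matching type value:
      "RZM": n * (alpha_hat - 1)
      "SZM": (alpha_hat - 1) / se(alpha_hat)
    """
    pairs = [(series[t - d], series[t]) for t in range(d, len(series))]
    n = len(pairs)  # assumption: n counts the observations used in the regression
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    alpha_hat = sxy / sxx
    ssr = sum((y - alpha_hat * x) ** 2 for x, y in pairs)
    se = (ssr / (n - 1) / sxx) ** 0.5  # one estimated parameter
    return {
        "RZM": n * (alpha_hat - 1.0),
        "SZM": (alpha_hat - 1.0) / se,
    }


stats = basic_df_statistics([1.0, 2.0, 3.0, 5.0], d=1)
print(stats)
```

Either value would then be passed as x to PROBDF together with the matching type value.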

The PROBDF function is calculated from approximating functions fit to empirical quantiles produced by Monte Carlo simulation employing {10^{8}} replications for each simulation. Separate simulations were performed for selected values of n and for d=1,2,4,6,12.

The maximum error of the PROBDF function is approximately {{\pm}10^{-3}} for d in the set (1,2,4,6,12) and may be slightly larger for other d values. (Because the number of simulation replications used to produce the PROBDF function is much greater than the 60,000 replications used by Dickey and Fuller (1979) and Dickey, Hasza, and Fuller (1984), the PROBDF function can be expected to produce results that are substantially more accurate than the critical values reported in those papers.)
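The idea of obtaining empirical quantiles by simulation can be illustrated on a very small scale. The Python sketch below is not how PROBDF is implemented: it uses only a few thousand replications, covers only the studentized zero mean statistic with d=1, and skips the step of fitting approximating functions. The function names szm_statistic and simulate_null_quantile are hypothetical.

```python
import random


def szm_statistic(series):
    """Studentized zero mean statistic for y_t = alpha * y_{t-1} + e_t."""
    n = len(series) - 1
    sxx = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    sxy = sum(series[t] * series[t - 1] for t in range(1, len(series)))
    alpha_hat = sxy / sxx
    ssr = sum((series[t] - alpha_hat * series[t - 1]) ** 2
              for t in range(1, len(series)))
    se = (ssr / (n - 1) / sxx) ** 0.5
    return (alpha_hat - 1.0) / se


def simulate_null_quantile(n, prob, replications, seed=0):
    """Empirical quantile of the statistic when the series is a random walk."""
    rng = random.Random(seed)
    values = []
    for _ in range(replications):
        y = [0.0]
        for _ in range(n):
            y.append(y[-1] + rng.gauss(0, 1))  # unit root holds under the null
        values.append(szm_statistic(y))
    values.sort()
    return values[int(prob * replications)]


# Rough 5% point of the null distribution for n = 50
q05 = simulate_null_quantile(n=50, prob=0.05, replications=2000)
print(q05)
```

With so few replications the estimated quantile is noisy, which is why a very large replication count matters for accuracy.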


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.