| The REG Procedure |
All the variables in the original data set are included in the new data set, along with variables created in the OUTPUT statement. These new variables contain the values of a variety of statistics and diagnostic measures that are calculated for each observation in the data set. If you want to create a permanent SAS data set, you must specify a two-level name (for example, libref.data-set-name). For more information on permanent SAS data sets, refer to the section "SAS Files" in SAS Language Reference: Concepts.
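As a minimal sketch of the two-level naming rule, the following statements write the output data set to a permanent library. The libref mylib, the directory path, and the data set name regout are hypothetical placeholders, and the input data set a with variables y, x1, and x2 is assumed to exist.

```sas
/* Hypothetical libref and path: point this at a directory you can write to. */
libname mylib '/path/to/project/data';

proc reg data=a;
   model y = x1 x2;
   /* The two-level name mylib.regout makes the data set permanent. */
   output out=mylib.regout p=yhat r=yresid;
run;
quit;
```

With a one-level name such as out=regout, the data set would normally go to the temporary WORK library and be removed at the end of the session.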
The OUTPUT statement cannot be used when a TYPE=CORR, TYPE=COV, or TYPE=SSCP data set is used as the input data set for PROC REG. See the "Input Data Sets" section for more details.
The statistics created in the OUTPUT statement are described in this section. More details are contained in the "Predicted and Residual Values" section and the "Influence Diagnostics" section. Also see Chapter 3, "Introduction to Regression Procedures," for definitions of the statistics available from the REG procedure.
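You can specify the following options in the OUTPUT statement.
OUT=SAS-data-set
names the new data set. If you omit OUT=, the procedure names the data set by using the DATAn convention.
keyword=names
specifies a statistic to include in the output data set and names the new variables that contain it. Specify a keyword for each desired statistic, an equal sign, and the variable or variables to contain the statistic.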
In the output data set, the first variable listed after a keyword in the OUTPUT statement contains that statistic for the first dependent variable listed in the MODEL statement; the second variable contains the statistic for the second dependent variable in the MODEL statement, and so on. The list of variables following the equal sign can be shorter than the list of dependent variables in the MODEL statement. In this case, the procedure creates the new names in order of the dependent variables in the MODEL statement.
For example, the SAS statements
   proc reg data=a;
      model y z=x1 x2;
      output out=b
             p=yhat zhat
             r=yresid zresid;
   run;
create an output data set named b.
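In addition to the variables in the input data set, b contains the following variables:
yhat, the predicted values of the dependent variable y
zhat, the predicted values of the dependent variable z
yresid, the residuals for y
zresid, the residuals for z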
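The rule for a name list that is shorter than the list of dependent variables can be sketched as follows. This example reuses the data set a from above; the output data set name c is a hypothetical choice, and the comment reflects our reading of the rule stated earlier rather than verified output.

```sas
proc reg data=a;
   model y z = x1 x2;
   /* Only one name follows P=, so it is matched to the first
      dependent variable, y. No name is supplied for z, so this
      sketch requests no predicted-value variable for z. */
   output out=c p=yhat;
run;
quit;
```

To get predicted values for z as well, supply a second name, as in p=yhat zhat.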
You can specify the following keywords in the OUTPUT statement. See the "Model Fit and Diagnostic Statistics" section for computational formulas.
| Keyword | Description |
| COOKD=names | Cook's D influence statistic |
| COVRATIO=names | standard influence of observation on covariance of betas, as discussed in the "Influence Diagnostics" section |
| DFFITS=names | standard influence of observation on predicted value |
| H=names | leverage, $x_i (X'X)^{-1} x_i'$ |
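| LCL=names | lower bound of a $100(1-\alpha)\%$ confidence interval for an individual prediction |
| LCLM=names | lower bound of a $100(1-\alpha)\%$ confidence interval for the expected value (mean) of the dependent variable |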
| PREDICTED | P=names | predicted values |
| PRESS=names | ith residual divided by $(1-h_i)$, where $h_i$ is the leverage of the ith observation; this equals the residual for the ith observation when the model is refit without that observation |
| RESIDUAL | R=names | residuals, calculated as ACTUAL minus PREDICTED |
| RSTUDENT=names | a studentized residual with the current observation deleted |
| STDI=names | standard error of the individual predicted value |
| STDP=names | standard error of the mean predicted value |
| STDR=names | standard error of the residual |
| STUDENT=names | studentized residuals, which are the residuals divided by their standard errors |
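| UCL=names | upper bound of a $100(1-\alpha)\%$ confidence interval for an individual prediction |
| UCLM=names | upper bound of a $100(1-\alpha)\%$ confidence interval for the expected value (mean) of the dependent variable |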
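The following sketch requests several of these keywords in one OUTPUT statement and then recomputes two of them from their descriptions in the table. The data set names diag and check and all of the new variable names are hypothetical; the input data set a with variables y, x1, and x2 is assumed to exist.

```sas
proc reg data=a;
   model y = x1 x2;
   /* One name per keyword because the model has one dependent variable. */
   output out=diag
          p=yhat       r=yresid
          stdr=se_res  student=stud
          h=lev        press=press_res
          cookd=cd     rstudent=rstud
          lclm=mean_lo uclm=mean_hi
          lcl=ind_lo   ucl=ind_hi;
run;
quit;

data check;
   set diag;
   /* STUDENT is described as the residual divided by its standard error. */
   stud_check  = yresid / se_res;
   /* PRESS is described as the residual divided by (1 - h). */
   press_check = yresid / (1 - lev);
run;

proc print data=check;
   var y yhat yresid stud stud_check press_res press_check;
run;
```

If the descriptions are read correctly, stud_check should agree with stud and press_check with press_res up to rounding, which gives a quick sanity check on how the output variables relate to one another.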
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.