The TRANSREG Procedure

**OUTPUT** **OUT=***SAS-data-set* < *o-options* >**;**

To specify the data set, use the OUT= specification.

**OUT=***SAS-data-set*-
specifies the output data set for the data, transformed data, predicted
values, residuals, scores, coefficients, and so on.
When you use an OUTPUT statement but do not
use the OUT= specification, PROC TRANSREG creates
a data set and names it by using the DATA*n* convention.
If you want to create a permanent SAS data set, you must specify
a two-level name (refer to "SAS Files" in
*SAS Language Reference: Concepts* and "Introduction to DATA Step Processing" in the *SAS Procedures Guide* for details).

To control the contents of the data set and variable names, use one or more of the *o-options*. You can also specify these options in the PROC TRANSREG statement.
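As a minimal sketch of the OUTPUT statement with a two-level OUT= name, the following step fits a linear model and writes predicted values and residuals. The data set `work.sample`, its variables, and the libref `proj` are hypothetical; the libref simply points at the WORK directory so that the step runs anywhere, and you would replace it with a real directory to keep the data set.

```sas
/* Hypothetical example data */
data work.sample;
   input y x1 x2;
   datalines;
3.1 1 4
4.0 2 3
5.2 3 5
5.9 4 2
7.1 5 6
7.8 6 1
9.2 7 4
9.7 8 3
;

/* Stand-in library (assumption): replace the path with a real directory */
libname proj "%sysfunc(pathname(work))";

proc transreg data=work.sample;
   model identity(y) = identity(x1 x2);
   /* Two-level name plus two o-options */
   output out=proj.fit predicted residuals;
run;

proc print data=proj.fit;
run;
```

With the default prefixes described in the list that follows, the predicted values and residuals for `y` should appear under names built from the P and R prefixes.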

| Task | Option |
| --- | --- |
| **Identify output data set** | |
| output data set | OUT= |
| **Predicted values, residuals, scores** | |
| outputs canonical scores | CANONICAL |
| outputs individual confidence limits | CLI |
| outputs mean confidence limits | CLM |
| specifies design matrix coding | DESIGN= |
| outputs leverage | LEVERAGE |
| does not restore missings | NORESTOREMISSING |
| suppresses output of scores | NOSCORES |
| outputs predicted values | PREDICTED |
| outputs redundancy variables | REDUNDANCY= |
| outputs residuals | RESIDUALS |
| **Output data set replacement** | |
| replaces dependent variables | DREPLACE |
| replaces independent variables | IREPLACE |
| replaces all variables | REPLACE |
| **Output data set coefficients** | |
| outputs coefficients | COEFFICIENTS |
| outputs ideal point coordinates | COORDINATES |
| outputs marginal means | MEANS |
| outputs redundancy analysis coefficients | MREDUNDANCY |
| **Output data set variable name prefixes** | |
| dependent variable approximations | ADPREFIX= |
| independent variable approximations | AIPREFIX= |
| canonical dependent variables | CDPREFIX= |
| conservative-individual-lower CL | CILPREFIX= |
| canonical independent variables | CIPREFIX= |
| conservative-individual-upper CL | CIUPREFIX= |
| conservative-mean-lower CL | CMLPREFIX= |
| conservative-mean-upper CL | CMUPREFIX= |
| METHOD=MORALS untransformed dependent | DEPENDENT= |
| liberal-individual-lower CL | LILPREFIX= |
| liberal-individual-upper CL | LIUPREFIX= |
| liberal-mean-lower CL | LMLPREFIX= |
| liberal-mean-upper CL | LMUPREFIX= |
| residuals | RDPREFIX= |
| predicted values | PPREFIX= |
| redundancy variables | RPREFIX= |
| transformed dependents | TDPREFIX= |
| transformed independents | TIPREFIX= |
| **Output data set macros** | |
| creates macro variables | MACRO |
| **Control CLASS variables** | |
| controls output of reference levels | REFERENCE= |
| **Output data set details** | |
| dependent and independent approximations | APPROXIMATIONS |
| canonical correlation coefficients | CCC |
| canonical elliptical point coordinates | CEC |
| canonical point coordinates | CPC |
| canonical quadratic point coordinates | CQC |
| approximations to transformed dependents | DAPPROXIMATIONS |
| approximations to transformed independents | IAPPROXIMATIONS |
| elliptical point coordinates | MEC |
| point coordinates | MPC |
| quadratic point coordinates | MQC |
| multiple regression coefficients | MRC |

For the coefficients partition, the COEFFICIENTS, COORDINATES, and MEANS *o-options* provide the coefficients that are appropriate for your model.

The following list provides details on these options.

**ADPREFIX=***name* **ADP=***name*-
specifies a prefix for naming the
dependent variable predicted values.
The default is ADPREFIX=P when you
specify the PREDICTED
*o-option*; otherwise, it is ADPREFIX=A. Specifying the ADPREFIX= *o-option* also implies the PREDICTED *o-option*, and the ADPREFIX= *o-option* is the same as the PPREFIX= *o-option*.
**AIPREFIX=***name* **AIP=***name*-
specifies a prefix for naming the
independent variable approximations.
The default is AIPREFIX=A.
Specifying the AIPREFIX=
*o-option* also implies the IAPPROXIMATIONS *o-option*.
**APPROXIMATIONS** **APPROX** **APP**-
is equivalent to specifying both
the DAPPROXIMATIONS and the IAPPROXIMATIONS
*o-options*. If METHOD=UNIVARIATE, then the APPROXIMATIONS *o-option* implies only the DAPPROXIMATIONS *o-option*.
**CANONICAL** **CAN**-
outputs canonical variables to the OUT= data set. When METHOD=CANALS,
the CANONICAL
*o-option* is implied. The CDPREFIX= *o-option* specifies a prefix for naming the dependent canonical variables (default Cand), and the CIPREFIX= *o-option* specifies a prefix for naming the independent canonical variables (default Cani).
**CCC**-
outputs canonical correlation coefficients to the OUT= data set.
**CDPREFIX=***name* **CDP=***name*-
provides a prefix for naming the canonical
dependent variables. The default is CDPREFIX=Cand.
Specifying the CDPREFIX=
*o-option* also implies the CANONICAL *o-option*.
**CEC**-
outputs canonical elliptical point model coordinates to the OUT= data set.
**CILPREFIX=***name* **CIL=***name*-
specifies a prefix for naming the conservative-individual-lower
confidence limits. The default prefix is CIL.
Specifying the CILPREFIX=
*o-option* also implies the CLI *o-option*.
**CIPREFIX=***name* **CIP=***name*-
provides a prefix for naming the canonical
independent variables. The default is CIPREFIX=Cani.
Specifying the CIPREFIX=
*o-option* also implies the CANONICAL *o-option*.
**CIUPREFIX=***name* **CIU=***name*-
specifies a prefix for naming the conservative-individual-upper
confidence limits. The default prefix is CIU.
Specifying the CIUPREFIX=
*o-option* also implies the CLI *o-option*.
**CLI**-
outputs individual confidence limits to the OUT= data set.
The names of the confidence limits variables are
constructed from the original dependent variable names
and the prefixes specified in
the following
*o-options*: LILPREFIX= (default LIL for liberal individual lower), CILPREFIX= (default CIL for conservative individual lower), LIUPREFIX= (default LIU for liberal individual upper), and CIUPREFIX= (default CIU for conservative individual upper). When there are no monotonicity constraints, the liberal and conservative limits are the same.
**CLM**-
outputs mean confidence limits to the OUT= data set.
The names of the confidence limits variables are
constructed from the original dependent variable names
and the prefixes specified in
the following
*o-options*: LMLPREFIX= (default LML for liberal mean lower), CMLPREFIX= (default CML for conservative mean lower), LMUPREFIX= (default LMU for liberal mean upper), and CMUPREFIX= (default CMU for conservative mean upper). When there are no monotonicity constraints, the liberal and conservative limits are the same.
**CMLPREFIX=***name* **CML=***name*-
specifies a prefix for naming the conservative-mean-lower confidence
limits. The default prefix is CML.
Specifying the CMLPREFIX=
*o-option* also implies the CLM *o-option*.
**CMUPREFIX=***name* **CMU=***name*-
specifies a prefix for naming the conservative-mean-upper confidence
limits. The default prefix is CMU.
Specifying the CMUPREFIX=
*o-option* also implies the CLM *o-option*.
**COEFFICIENTS** **COE**-
outputs either
multiple regression coefficients
or raw canonical coefficients to the OUT= data set.
If you specify METHOD=CANALS (in the MODEL or PROC TRANSREG
statement), then the COEFFICIENTS
*o-option* outputs the first *n* canonical variables, where *n* is the value of the NCAN= *a-option* (specified in the MODEL or PROC TRANSREG statement). Otherwise, the COEFFICIENTS *o-option* includes multiple regression coefficients in the OUT= data set. In addition, when you specify the CLASS expansion for any independent variable, the COEFFICIENTS *o-option* also outputs marginal means.
**COORDINATES** **COO**-
outputs either ideal point or vector model coordinates for
preference mapping to the OUT= data set.
When METHOD=CANALS, these coordinates are computed from
canonical coefficients; otherwise, the coordinates
are computed from multiple regression coefficients.
For details, see the "Point Models" section.
**CPC**-
outputs canonical point model coordinates to the OUT= data set.
**CQC**-
outputs canonical quadratic point model coordinates to the OUT= data set.
**DAPPROXIMATIONS** **DAP**-
outputs the approximations of the transformed dependent variables to the
OUT= data set. These are the target values for the optimal
transformations. With METHOD=UNIVARIATE and METHOD=MORALS, the dependent
variable approximations are the ordinary predicted values from the
linear model. The names of the approximation variables are constructed
from the ADPREFIX=
*o-option* (default A) and the original dependent variable names. For ordinary predicted values, use the PREDICTED *o-option* instead of the DAPPROXIMATIONS *o-option*, since the PREDICTED *o-option* uses a more relevant prefix ("P" instead of "A") and a more relevant variable label suffix ("Predicted Values" instead of "Approximations").
**DESIGN<=***n*> **DES<=***n*>-
specifies that your primary goal is design matrix
coding, not analysis.
Specifying the DESIGN
*o-option* makes the procedure run faster. The DESIGN *o-option* sets the default method to UNIVARIATE and the default MAXITER= value to zero. It suppresses computing the regression coefficients, unless they are needed for some other option. Furthermore, when the DESIGN *o-option* is specified, the MODEL statement is not required to have an equal sign. When no MODEL statement equal sign is specified, all variables are considered independent variables, all options that require dependent variables are ignored, and the IREPLACE *o-option* is implied.

You can use DESIGN=*n* for coding very large data sets, where *n* is the number of observations to code at one time. For example, to code a data set with a large number of observations, you can specify DESIGN=100 or DESIGN=1000 to process the data set in blocks of 100 or 1000 observations. If you specify the DESIGN *o-option* rather than DESIGN=*n*, PROC TRANSREG tries to process all observations at once, which will not work with very large data sets. Specify the NOZEROCONSTANT *a-option* with DESIGN=*n* to ensure that constant variables within blocks are not zeroed. See the section "Using the DESIGN Output Option" and the section "Choice Experiments: DESIGN, NORESTOREMISSING, NOZEROCONSTANT Usage".
**DEPENDENT=***name* **DEP=***name*-
specifies the untransformed dependent variable for
OUT= data sets with METHOD=MORALS when there is more than one
dependent variable. The default is DEPENDENT=_DEPEND_.
**DREPLACE** **DRE**-
replaces the original dependent variables with the
transformed dependent variables in the OUT= data set.
The names of the transformed variables in the
OUT= data set correspond to the names of the
original dependent variables in the input data set.
By default, both the original dependent variables and
transformed dependent variables (with names constructed
from the TDPREFIX= *o-option* (default T) and the original dependent variable names) are included in the OUT= data set.
**IAPPROXIMATIONS** **IAP**-
outputs the approximations of the transformed independent variables to
the OUT= data set. These are the target values for the optimal
transformations. The names of the approximation variables are
constructed from the AIPREFIX=
*o-option* (default A) and the original independent variable names. Specifying the AIPREFIX= *o-option* also implies the IAPPROXIMATIONS *o-option*. The IAPPROXIMATIONS *o-option* is not valid when METHOD=UNIVARIATE.
**IREPLACE** **IRE**-
replaces the original independent variables with the
transformed independent variables in the OUT= data set.
The names of the transformed variables in the
OUT= data set correspond to the names of the
original independent variables in the input data set.
By default, both the original independent variables and
transformed independent variables (with names constructed
from the TIPREFIX=
*o-option* (default T) and the original independent variable names) are included in the OUT= data set.
**LEVERAGE<=***name*> **LEV<=***name*>-
creates a variable with the specified name in the OUT= data set that
contains leverages. Specifying the LEVERAGE
*o-option* is equivalent to specifying LEVERAGE=Leverage.
**LILPREFIX=***name* **LIL=***name*-
specifies a prefix for naming the liberal-individual-lower confidence
limits. The default prefix is LIL.
Specifying the LILPREFIX=
*o-option* also implies the CLI *o-option*.
**LIUPREFIX=***name* **LIU=***name*-
specifies a prefix for naming the liberal-individual-upper confidence
limits. The default prefix is LIU.
Specifying the LIUPREFIX=
*o-option* also implies the CLI *o-option*.
**LMLPREFIX=***name* **LML=***name*-
specifies a prefix for naming the liberal-mean-lower confidence limits.
The default prefix is LML.
Specifying the LMLPREFIX=
*o-option* also implies the CLM *o-option*.
**LMUPREFIX=***name* **LMU=***name*-
specifies a prefix for naming the liberal-mean-upper confidence limits.
The default prefix is LMU.
Specifying the LMUPREFIX=
*o-option* also implies the CLM *o-option*.
**MACRO(***keyword=name...*) **MAC(***keyword=name...*)-
creates macro variables. Most of the options available within the MACRO
*o-option* are rarely needed. By default, the TRANSREG procedure creates a macro variable named _TRGIND with a complete list of independent variables created by the procedure. When the TRANSREG procedure is being used for design matrix creation prior to running a procedure without a CLASS statement, this macro variable provides a convenient way to use the results from PROC TRANSREG. For example, a PROC LOGISTIC step that uses a design matrix coded by PROC TRANSREG could use the following MODEL statement:

    model y=&_trgind;

The TRANSREG procedure, also by default, creates a macro variable named _TRGINDN, which contains the number of variables in the _TRGIND list. This macro variable could be used in an ARRAY statement as follows:

    array indvars[&_trgindn] &_trgind;

See the section "Using the DESIGN Output Option" and the section "Choice Experiments: DESIGN, NORESTOREMISSING, NOZEROCONSTANT Usage" for examples of using the default macro variables.
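The following sketch puts the two default macro variables together. The data set `work.trial` and its variables are hypothetical, and PROC REG is used here (rather than the PROC LOGISTIC step mentioned above) only because it also has no CLASS statement and accepts a continuous response. The ID statement is assumed to carry `y` into the coded data set.

```sas
/* Hypothetical data: a numeric response and two classification variables */
data work.trial;
   input y dose $ site $;
   datalines;
12 low  a
15 mid  a
19 high a
11 low  b
14 mid  b
21 high b
10 low  c
16 mid  c
20 high c
13 low  a
17 mid  b
22 high c
;

/* Code the CLASS variables only; with DESIGN the MODEL equal sign is optional */
proc transreg data=work.trial design;
   model class(dose site);
   id y;
   output out=work.coded;
run;

/* Show what the procedure stored in the default macro variables */
%put NOTE: &_trgindn coded variables: &_trgind;

/* Use the coded columns in a procedure that has no CLASS statement */
proc reg data=work.coded;
   model y = &_trgind;
run;
quit;

/* Use the count to size an array over the coded columns */
data work.check;
   set work.coded;
   array indvars[&_trgindn] &_trgind;
   n_nonzero = 0;
   do i = 1 to dim(indvars);
      if indvars[i] ne 0 then n_nonzero + 1;
   end;
   drop i;
run;
```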

The available *keywords* are as follows.

- DN=*name* - specifies the name of a macro variable that contains the number of dependent variables. By default, a macro variable named _TRGDEPN is created. This is the number of variables in the DL= list and the number of macro variables created by the DV= and DE= specifications.
- IN=*name* - specifies the name of a macro variable that contains the number of independent variables. By default, a macro variable named _TRGINDN is created. This is the number of variables in the IL= list and the number of macro variables created by the IV= and IE= specifications.
- DL=*name* - specifies the name of a macro variable that contains the list of the dependent variables. By default, a macro variable named _TRGDEP is created. These are the variable names of the final transformed variables in the OUT= data set. For example, if there are three dependent variables, Y1-Y3, then _TRGDEP contains, by default, TY1 TY2 TY3 (or Y1 Y2 Y3 if you specify the REPLACE *o-option*).
- IL=*name* - specifies the name of a macro variable that contains the list of the independent variables. By default, a macro variable named _TRGIND is created. These are the variable names of the final transformed variables in the OUT= data set. For example, if there are three independent variables, X1-X3, then _TRGIND contains, by default, TX1 TX2 TX3 (or X1 X2 X3 if you specify the REPLACE *o-option*).
- DV=*prefix* - specifies a prefix for creating a list of macro variables, each of which contains one dependent variable name. For example, if there are three dependent variables, Y1-Y3, and you specify MACRO(DV=DEP), then three macro variables, DEP1, DEP2, and DEP3, are created, containing TY1, TY2, and TY3, respectively (or Y1, Y2, Y3 if you specify the REPLACE *o-option*). By default, no list is created.
- IV=*prefix* - specifies a prefix for creating a list of macro variables, each of which contains one independent variable name. For example, if there are three independent variables, X1-X3, and you specify MACRO(IV=IND), then three macro variables, IND1, IND2, and IND3, are created, containing TX1, TX2, and TX3, respectively (or X1, X2, X3 if you specify the REPLACE *o-option*). By default, no list is created.
- DE=*prefix* - specifies a prefix for creating a list of macro variables, each of which contains one dependent variable effect. This list shows the origin of each model term. Each effect consists of two or more parts, and each part consists of a value in 32 columns followed by a blank. For example, if you specify MACRO(DE=D), then a macro variable D1 is created for IDENTITY(Y). The D1 macro variable is shown below, with extra white space removed.

      4 TY IDENTITY Y

  The first part is the number of parts (4), the second part is the transformed variable name, the third part is the transformation, and the last part is the input variable name. By default, no list is created.
- IE=*prefix* - specifies a prefix for creating a list of macro variables, each of which contains one independent variable effect. This list shows the origin of each model term. Each effect consists of two or more parts, and each part consists of a value in 32 columns followed by a blank. For example, if you specify MACRO(IE=I), then three macro variables, I1, I2, and I3, are created for CLASS(X1 | X2) when both X1 and X2 have values of 1 and 2. These macro variables are shown below, one per line, with extra white space removed.

      5 Tx11 CLASS x1 1
      5 Tx21 CLASS x2 1
      8 Tx11x21 CLASS x1 1 CLASS x2 1

  For CLASS variables, the formatted level appears after the variable name. The first two effects are the main effects, and the last is the interaction term. By default, no list is created.
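As a small sketch of the keyword syntax, the step below asks for the independent variable list and its count under names of your choosing. The data set `work.trial` and the macro variable names `xvars` and `nx` are hypothetical, and the form `macro(il=... in=...)` follows the MACRO(*keyword=name...*) pattern described above.

```sas
/* Hypothetical data: two classification variables */
data work.trial;
   input dose $ site $;
   datalines;
low  a
mid  a
high a
low  b
mid  b
high b
;

proc transreg data=work.trial design;
   model class(dose site);
   /* IL= names the list variable, IN= names the count variable */
   output out=work.coded macro(il=xvars in=nx);
run;

/* Write the requested macro variables to the log */
%put NOTE: &nx independent variables: &xvars;
```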

**MEANS** **MEA**-
outputs marginal means for CLASS variable expansions to the OUT= data
set.
**MEC**-
outputs multiple regression elliptical point model coordinates to the
OUT= data set.
**MPC**-
outputs multiple regression point model coordinates to the OUT= data
set.
**MQC**-
outputs multiple regression quadratic point model coordinates to the
OUT= data set.
**MRC**-
outputs multiple regression coefficients to the OUT= data set.
**MREDUNDANCY** **MRE**-
outputs multiple redundancy analysis coefficients to the OUT= data
set.
**NORESTOREMISSING** **NORESTORE** **NOR**-
specifies that missing values should not be
restored when the OUT= data set is created. By default, the coded
CLASS variable contains a row of missing values for observations in
which the CLASS variable is missing. When you
specify the NORESTOREMISSING
*o-option*, these observations contain a row of zeros instead. This is useful when the TRANSREG procedure is used to code designs for choice models and there is a constant alternative indicated by a missing value.
**NOSCORES** **NOS**-
excludes original variables, transformed variables, predicted values,
residuals, and scores from the OUT= data set. You can use the
NOSCORES
*o-option* with various other options to create an OUT= data set that contains only a coefficient partition (for example, a data set consisting entirely of coefficients and coordinates).
**PREDICTED** **PRE** **P**-
outputs predicted values, which for METHOD=UNIVARIATE and METHOD=MORALS
are the ordinary predicted values from the linear model, to the OUT=
data set. The names of the predicted values' variables are constructed
from the PPREFIX=
*o-option* (default P) and the original dependent variable names. Specifying the PPREFIX= *o-option* also implies the PREDICTED *o-option*.
**PPREFIX=***name* **PDPREFIX=***name* **PDP=***name*-
specifies a prefix for naming the
dependent variable predicted values.
The default is PPREFIX=P when you specify the
PREDICTED
*o-option*; otherwise, it is PPREFIX=A. Specifying the PPREFIX= *o-option* also implies the PREDICTED *o-option*, and the PPREFIX= *o-option* is the same as the ADPREFIX= *o-option*.
**RDPREFIX=***name* **RDP=***name*-
specifies a prefix for naming the residual
(dependent) variables in the OUT= data
set. The default is RDPREFIX=R. Specifying
the RDPREFIX=
*o-option* also implies the RESIDUALS *o-option*.
**REDUNDANCY<=STANDARDIZE | UNSTANDARDIZE>** **RED<=STA | UNS>**-
outputs redundancy variables to the OUT= data set, either standardized
or unstandardized. Specifying the REDUNDANCY
*o-option* is the same as specifying REDUNDANCY=STANDARDIZE. The results of the REDUNDANCY *o-option* depend on the TSTANDARD= option. You must specify TSTANDARD=Z to get results based on standardized data. The TSTANDARD= option controls how the data that go into the redundancy analysis are scaled, and REDUNDANCY=STANDARDIZE|UNSTANDARDIZE controls how the redundancy variables are scaled. The REDUNDANCY *o-option* is implied by METHOD=REDUNDANCY. The RPREFIX= *o-option* specifies a prefix (default Red) for naming the redundancy variables.
**REFERENCE=NONE | MISSING | ZERO** **REF=NON | MIS | ZER**-
specifies how reference levels of CLASS variables are to be treated.
The options are REFERENCE=NONE, the default, in which reference
levels are suppressed; REFERENCE=MISSING, in which reference levels
are displayed and output with missing values; and REFERENCE=ZERO, in which
reference levels are displayed and output with zeros. The REFERENCE= option can
be specified in the PROC TRANSREG, MODEL, or OUTPUT statement, and it can be
independently specified for the OUT= data set and the displayed
output. When you specify it in only one statement, it sets the option for
both the displayed output and the OUT= data set.
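A hedged sketch of REFERENCE= in the OUTPUT statement follows. The data set `work.survey` and its variables are hypothetical. Per the description above, the reference level of `brand` would be suppressed by default; with REFERENCE=ZERO it is expected to appear in the OUT= data set as a column of zeros, which you can confirm in the printed result.

```sas
/* Hypothetical data: one classification variable and a rating */
data work.survey;
   input brand $ rating;
   datalines;
A 7
B 5
C 6
A 8
B 4
C 6
;

proc transreg data=work.survey design;
   model class(brand);
   id rating;
   /* Request that the reference level be output with zeros */
   output out=work.coded_ref reference=zero;
run;

proc print data=work.coded_ref;
run;
```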
**REPLACE** **REP**-
is equivalent to specifying both the DREPLACE and the IREPLACE
*o-options*.
**RESIDUALS** **RES** **R**-
outputs the differences between the transformed dependent variables and
their predicted values. The names of the residual variables are
constructed from the RDPREFIX=
*o-option* (default R) and the original dependent variable names.
**RPREFIX=***name* **RPR=***name*-
provides a prefix for naming the redundancy
variables. The default is RPREFIX=Red.
Specifying the RPREFIX=
*o-option* also implies the REDUNDANCY *o-option*.
**TDPREFIX=***name* **TDP=***name*-
specifies a prefix for naming the
transformed dependent variables.
By default, TDPREFIX=T.
The TDPREFIX=
*o-option* is ignored when you specify the DREPLACE *o-option*.
**TIPREFIX=***name* **TIP=***name*-
specifies a prefix for naming the
transformed independent variables.
By default, TIPREFIX=T.
The TIPREFIX=
*o-option* is ignored when you specify the IREPLACE *o-option*.
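The three steps below sketch how several of the options in this list combine. The data set `work.sample` and the prefixes `Pred` and `Res` are hypothetical. The first step relies on the implication rules stated above (PPREFIX= implies PREDICTED, RDPREFIX= implies RESIDUALS), the second uses NOSCORES to keep only a coefficient partition, and the third uses REPLACE so that the transformed values take the original variable names.

```sas
/* Hypothetical example data */
data work.sample;
   input y x1 x2;
   datalines;
3.1 1 4
4.0 2 3
5.2 3 5
5.9 4 2
7.1 5 6
7.8 6 1
9.2 7 4
9.7 8 3
;

/* Custom prefixes for predicted values and residuals, plus mean confidence limits */
proc transreg data=work.sample;
   model identity(y) = identity(x1 x2);
   output out=work.scored pprefix=Pred rdprefix=Res clm;
run;

/* Coefficient partition only: NOSCORES drops the data, transformed data, and scores */
proc transreg data=work.sample;
   model identity(y) = identity(x1 x2);
   output out=work.coefs coefficients noscores;
run;

/* Transformed variables replace the originals under the original names */
proc transreg data=work.sample;
   model identity(y) = spline(x1) identity(x2);
   output out=work.replaced replace;
run;
```

Running PROC CONTENTS on each output data set is a quick way to check which variable names the prefixes produced.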


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.