The PROBIT Procedure

**PROC PROBIT** *< options >* **;**

The PROC PROBIT statement starts the procedure. You can specify the following options in the PROC PROBIT statement.

**COVOUT**-
writes the parameter estimate covariance matrix to the OUTEST= data set.
**C=***rate*-
**OPTC**-
controls how the natural response is handled.
Specify the OPTC option to request that the natural response rate *C* be estimated. Specify the C=*rate* option to set the natural response rate or to provide the initial estimate of the natural response rate. The natural response rate value must be a number between 0 and 1.
- If you specify neither the OPTC nor the C= option, a natural response rate of zero is assumed.
- If you specify both the OPTC and the C= option, the C= option should be a reasonable initial estimate of the natural response rate. For example, you could use the ratio of the number of responses to the number of subjects in a control group.
- If you specify the C= option but not the OPTC option, the natural response rate is set to the specified value and not estimated.
- If you specify the OPTC option but not the C= option, PROC PROBIT's action depends on the response variable, as follows:
  - If you specify either the LN or LOG10 option and some subjects have values of the first independent variable (dose) less than or equal to zero, these subjects are treated as a control group. The initial estimate of *C* is then the ratio of the number of responses to the number of subjects in this group.
  - If you do not specify the LN or LOG10 option or if there is no control group, then one of the following occurs:
    - If all responses are greater than zero, the initial estimate of the natural response rate is the minimal response rate (the ratio of the number of responses to the number of subjects in a dose group) across all dose levels.
    - If one or more of the responses is zero (making the response rate zero in that dose group), the initial estimate of the natural rate is the reciprocal of twice the largest number of subjects in any dose group in the experiment.
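
As an illustration of two of the combinations above, the following sketch uses a hypothetical data set named `assay`; the data values and the variable names `dose`, `n`, and `response` are invented for this example and do not come from this chapter. The first PROC PROBIT step supplies the control-group ratio 4/50 = 0.08 as an initial estimate and asks for *C* to be estimated. The second step fixes *C* at 0.08.

```sas
/* Hypothetical grouped dose-response data (illustrative values only). */
/* The dose = 0 row plays the role of a control group.                 */
data assay;
   input dose n response;
   datalines;
0   50  4
1   50  9
2   50 18
4   50 31
8   50 44
;
run;

/* OPTC together with C=: C is estimated, starting from 0.08. */
proc probit data=assay log10 optc c=0.08;
   model response/n = dose;
run;

/* C= without OPTC: C is set to 0.08 and not estimated.               */
/* With LOG10 and no OPTC, observations with dose <= 0 are ignored,   */
/* as described under the LOG and LN options.                         */
proc probit data=assay log10 c=0.08;
   model response/n = dose;
run;
```

Omitting `c=0.08` from the first step would leave the initial estimate to the rules listed above; here the dose = 0 group would supply the same ratio.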

**DATA=***SAS-data-set*-
names the SAS data set to be used by PROC PROBIT.
By default, the procedure uses the
most recently created SAS data set.
**HPROB=***p*-
specifies a minimum probability level
for the Pearson chi-square
to indicate a good fit. The default value is 0.10.
The LACKFIT option must also be specified
for this option to have any effect.
For Pearson goodness of fit chi-square values with
probability greater than the HPROB= value, the fiducial
limits, if requested with the INVERSECL option,
are computed using a critical value of 1.96.
For chi-square values with probability less than the value of the
HPROB= option, the critical value is a 0.95 two-sided quantile
value taken from the
*t* distribution with degrees of freedom equal to (*k* - 1) × *m* - *q*, where *k* is the number of levels for the response variable, *m* is the number of different sets of independent variable values, and *q* is the number of parameters fit in the model. Note that the HPROB= option can also appear in the MODEL statement.
**INVERSECL**-
computes confidence limits for the values of the
first continuous independent variable (such as
dose) that yield selected response rates.
If the algorithm fails to converge (this can happen when
*C* is nonzero), missing values are reported for the confidence limits. See the section "Inverse Confidence Limits" for details. Note that the INVERSECL option can also appear in the MODEL statement.
**LACKFIT**-
performs two goodness-of-fit tests (a Pearson chi-square test
and a log-likelihood ratio chi-square test) for the fitted model.
**Note:** The data set must be sorted by the independent variables before the PROBIT procedure is run if you want to perform a test of fit. This test is not appropriate if the data are very sparse, with only a few values at each set of the independent variable values.

If the Pearson chi-square test statistic is significant, then the covariance estimates and standard error estimates are adjusted. See the "Lack of Fit Tests" section for a description of the tests. Note that the LACKFIT option can also appear in the MODEL statement.
**LOG**-
**LN**-
analyzes the data by replacing the first continuous
independent variable by its natural logarithm.
This variable is usually the level of some treatment such as dosage.
In addition to the usual output given by the INVERSECL
option, the estimated dose values and 95% fiducial
limits for dose are also displayed.
If you specify the OPTC option, any observations with a dose value less than
or equal to zero are used in the estimation as a control group.
If you do not specify the OPTC option with the LOG or LN option,
then any observations
with the first continuous independent variable values less than or equal
to zero are ignored.
**LOG10**-
specifies an analysis like that of the LN or LOG option except
that the common logarithm (log to the base 10) of the
dose value is used rather than the natural logarithm.
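
The following sketch shows how the LOG10 option might be combined with the LACKFIT, INVERSECL, and HPROB= options described earlier. The data set `assay`, its values, and the variable names `dose`, `n`, and `response` are hypothetical and chosen only for illustration; the HPROB= value of 0.05 is likewise an arbitrary choice, not a recommendation.

```sas
/* Hypothetical grouped dose-response data (illustrative values only). */
data assay;
   input dose n response;
   datalines;
1   50  6
2   50 14
4   50 27
8   50 39
16  50 47
;
run;

/* LACKFIT expects the data to be sorted by the independent variables. */
proc sort data=assay;
   by dose;
run;

/* Analyze log10(dose), request the goodness-of-fit tests, and request */
/* inverse confidence limits. HPROB= sets the probability level that   */
/* decides which critical value is used for the fiducial limits.       */
proc probit data=assay log10 lackfit inversecl hprob=0.05;
   model response/n = dose;
run;
```

All doses in this sketch are positive, so no observations are dropped or treated as a control group by the logarithm options.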
**NOPRINT**-
suppresses the display of all output. Note that this option
temporarily disables the Output Delivery System (ODS).
For more information, see Chapter 15, "Using the Output Delivery System."
**OPTC**-
controls how the natural response is handled.
See the description of the C= option for details.
**ORDER=DATA | FORMATTED | FREQ | INTERNAL**-
specifies the sorting order for the levels of the
classification variables specified in the CLASS statement,
including the levels of the response variable.
Response level ordering is important since PROC PROBIT always
models the probability of response levels at the beginning of the
ordering. See the section "Response Level Ordering" for further details.
This ordering also determines which parameters in
the model correspond to each level in the data.
The following table shows how PROC PROBIT
interprets values of the ORDER= option.
| Value of ORDER= | Levels Sorted By |
|---|---|
| DATA | order of appearance in the input data set |
| FORMATTED | formatted value |
| FREQ | descending frequency count; levels with the most observations come first in the order |
| INTERNAL | unformatted value |

By default, ORDER=FORMATTED. For the values FORMATTED and INTERNAL, the sort order is machine dependent. For more information on sorting order, see the chapter on the SORT procedure in the *SAS Procedures Guide*.
**OUTEST=***SAS-data-set*-
specifies a SAS data set to contain the parameter
estimates and, if the COVOUT option is specified, their estimated covariances.
If you omit this option,
the output data set is not created.
The contents of the data set are
described in
the section "OUTEST= Data Set".
This data set is not created if class variables are used.
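
A minimal sketch of saving the estimates without displaying any output, using the OUTEST=, COVOUT, and NOPRINT options together. The data set names `assay` and `est`, the data values, and the variable names are hypothetical. No CLASS statement is used, since the OUTEST= data set is not created when class variables are present.

```sas
/* Hypothetical grouped dose-response data (illustrative values only). */
data assay;
   input dose n response;
   datalines;
1   50  6
2   50 14
4   50 27
8   50 39
16  50 47
;
run;

/* NOPRINT suppresses displayed output; OUTEST= stores the parameter  */
/* estimates, and COVOUT adds their estimated covariance matrix.      */
proc probit data=assay log10 noprint outest=est covout;
   model response/n = dose;
run;

/* Inspect the saved estimates and covariances. */
proc print data=est;
run;
```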

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.