The LOGISTIC Procedure
The accuracy of the classification is measured by its sensitivity (the ability to predict an event correctly) and specificity (the ability to predict a nonevent correctly). Sensitivity is the proportion of event responses that were predicted to be events. Specificity is the proportion of nonevent responses that were predicted to be nonevents. PROC LOGISTIC also computes three other conditional probabilities: false positive rate, false negative rate, and rate of correct classification. The false positive rate is the proportion of predicted event responses that were observed as nonevents. The false negative rate is the proportion of predicted nonevent responses that were observed as events. Given prior probabilities specified with the PEVENT= option, these conditional probabilities can be computed as posterior probabilities using Bayes' theorem.
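As an illustration of these definitions, the following Python sketch tabulates a 2x2 classification table and derives the five rates from it. The function name `classification_rates`, the sample data, and the rule that an observation is predicted to be an event when its probability is at least the cutpoint are assumptions made for this example; it is not PROC LOGISTIC output. Note that the false positive and false negative rates are conditioned on the predicted class, as defined above.

```python
def classification_rates(observed, predicted_prob, cutpoint=0.5):
    """Summarize a 2x2 classification table for binary responses.

    observed       -- iterable of 0/1 values (1 = event)
    predicted_prob -- predicted event probabilities, same order
    cutpoint       -- assumed rule: predict an event when prob >= cutpoint
    """
    tp = fp = tn = fn = 0
    for y, p in zip(observed, predicted_prob):
        predicted_event = p >= cutpoint
        if predicted_event and y == 1:
            tp += 1          # event predicted as event
        elif predicted_event:
            fp += 1          # nonevent predicted as event
        elif y == 1:
            fn += 1          # event predicted as nonevent
        else:
            tn += 1          # nonevent predicted as nonevent

    def ratio(num, den):
        # Undefined rates (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),          # events predicted as events
        "specificity": ratio(tn, tn + fp),          # nonevents predicted as nonevents
        "false_positive_rate": ratio(fp, tp + fp),  # predicted events observed as nonevents
        "false_negative_rate": ratio(fn, tn + fn),  # predicted nonevents observed as events
        "correct": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical data: eight observations with made-up predicted probabilities.
observed = [1, 1, 1, 0, 0, 0, 1, 0]
predicted_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
rates = classification_rates(observed, predicted_prob)
for name, value in rates.items():
    print(name, value)
```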
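When you classify a set of binary data, if the same observations used to fit the model are also used to estimate the classification error, the resulting error-count estimate is biased. One way of reducing the bias is to remove the binary observation to be classified from the data, reestimate the parameters of the model, and then classify the observation based on the new parameter estimates. However, it would be costly to refit the model for each observation. The LOGISTIC procedure provides a less expensive one-step approximation to the preceding parameter estimates. Let $\mathbf{b}$ be the MLE of the parameter vector based on all observations. Let $\mathbf{b}_j$ denote the MLE computed without the $j$th observation. The one-step estimate of $\mathbf{b}_j$ is given by

$$
\mathbf{b}_j^1 = \mathbf{b} - \frac{w_j (y_j - \hat{p}_j)}{1 - h_{jj}} \, \hat{\mathbf{V}}_{\mathbf{b}} \begin{pmatrix} 1 \\ \mathbf{x}_j \end{pmatrix}
$$

where $y_j$ is 1 for an event response and 0 otherwise, $w_j$ is the weight of the observation, $\hat{p}_j$ is the predicted event probability based on $\mathbf{b}$, $h_{jj}$ is the hat diagonal element for the $j$th observation, and $\hat{\mathbf{V}}_{\mathbf{b}}$ is the estimated covariance matrix of $\mathbf{b}$.
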
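The one-step idea can be sketched in Python for a model with an intercept and a single covariate. This is an illustrative sketch under stated assumptions, not the procedure's implementation: it assumes unit weights and the usual deletion form of the one-step estimate, b minus V(1, x_j)' times (y_j - p_j)/(1 - h_jj), where V is the estimated covariance matrix of b and h_jj is the hat diagonal. The helper names (`score_and_cov`, `fit_logistic`, `one_step_loo`) and the data are hypothetical.

```python
import math


def _prob(b, x):
    """Event probability under logit(p) = b[0] + b[1] * x."""
    return 1.0 / (1.0 + math.exp(-(b[0] + b[1] * x)))


def score_and_cov(b, xs, ys):
    """Score vector and inverse information matrix (2x2) evaluated at b."""
    g0 = g1 = i00 = i01 = i11 = 0.0
    for x, y in zip(xs, ys):
        p = _prob(b, x)
        v = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        i00 += v
        i01 += v * x
        i11 += v * x * x
    det = i00 * i11 - i01 * i01
    cov = ((i11 / det, -i01 / det), (-i01 / det, i00 / det))
    return (g0, g1), cov


def fit_logistic(xs, ys, iterations=25):
    """Newton-Raphson MLE; returns (b, estimated covariance matrix of b)."""
    b = (0.0, 0.0)
    for _ in range(iterations):
        (g0, g1), cov = score_and_cov(b, xs, ys)
        b = (b[0] + cov[0][0] * g0 + cov[0][1] * g1,
             b[1] + cov[1][0] * g0 + cov[1][1] * g1)
    return b, score_and_cov(b, xs, ys)[1]


def one_step_loo(b, cov, x_j, y_j):
    """One-step approximation to the MLE with observation j removed (weight 1)."""
    p = _prob(b, x_j)
    # u = V * (1, x_j)'
    u0 = cov[0][0] + cov[0][1] * x_j
    u1 = cov[1][0] + cov[1][1] * x_j
    # Hat diagonal: p(1 - p) * (1, x_j) V (1, x_j)'
    h = p * (1.0 - p) * (u0 + x_j * u1)
    scale = (y_j - p) / (1.0 - h)
    return (b[0] - scale * u0, b[1] - scale * u1)


# Hypothetical, non-separable data.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b, cov = fit_logistic(xs, ys)

j = 3  # observation to leave out
b_one_step = one_step_loo(b, cov, xs[j], ys[j])
b_refit, _ = fit_logistic(xs[:j] + xs[j + 1:], ys[:j] + ys[j + 1:])
print("full fit:", b)
print("one-step:", b_one_step)
print("refit:   ", b_refit)
```

Comparing `b_one_step` with `b_refit` shows how close the approximation comes to a full refit without iterating again. Under these assumptions the one-step estimate coincides with a single Newton-Raphson step, started at the full-data MLE, on the data with observation j deleted.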
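Let $B$ denote the event that a subject has the disease and $\bar{B}$ denote the event of not having the disease. Let $A$ denote the event that the subject responds positively, and let $\bar{A}$ denote the event of responding negatively. Results of the classification are represented by two conditional probabilities, $\Pr(A|B)$ and $\Pr(A|\bar{B})$, where $\Pr(A|B)$ is the sensitivity and $\Pr(A|\bar{B})$ is one minus the specificity. These probabilities are estimated from the classification table as the proportion of observed events that are predicted to be events and the proportion of observed nonevents that are predicted to be events, respectively.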

Bayes' theorem is used to compute the error rates of the classification. For a given prior probability $\Pr(B)$ of the disease, the false positive rate $P_{F+}$ and the false negative rate $P_{F-}$ are given by Fleiss (1981, pp. 4-5) as follows:
$$
\begin{aligned}
P_{F+} = \Pr(\bar{B}|A) &= \frac{\Pr(A|\bar{B})\,[1-\Pr(B)]}{\Pr(A|\bar{B}) + \Pr(B)\,[\Pr(A|B) - \Pr(A|\bar{B})]} \\
P_{F-} = \Pr(B|\bar{A}) &= \frac{[1-\Pr(A|B)]\,\Pr(B)}{1-\Pr(A|\bar{B}) - \Pr(B)\,[\Pr(A|B) - \Pr(A|\bar{B})]}
\end{aligned}
$$
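The Bayes' theorem computation can be checked numerically with a few lines of Python. This is a minimal sketch: the function name `bayes_error_rates` and the input values are hypothetical, and `prior_event` stands for the prior probability Pr(B) that would be supplied through the PEVENT= option.

```python
def bayes_error_rates(sensitivity, specificity, prior_event):
    """Return (false positive rate, false negative rate) by Bayes' theorem.

    sensitivity -- Pr(A | B)
    specificity -- 1 - Pr(A | not B)
    prior_event -- Pr(B), the prior probability of the event
    """
    p_a_given_b = sensitivity
    p_a_given_not_b = 1.0 - specificity
    # Pr(A) = Pr(A | not B) + Pr(B) * [Pr(A | B) - Pr(A | not B)]
    p_a = p_a_given_not_b + prior_event * (p_a_given_b - p_a_given_not_b)
    false_positive = p_a_given_not_b * (1.0 - prior_event) / p_a       # Pr(not B | A)
    false_negative = (1.0 - p_a_given_b) * prior_event / (1.0 - p_a)   # Pr(B | not A)
    return false_positive, false_negative


# Hypothetical inputs: a fairly accurate classifier applied to a rare event.
pf_pos, pf_neg = bayes_error_rates(sensitivity=0.9, specificity=0.8, prior_event=0.1)
print(pf_pos, pf_neg)
```

With a small prior probability, the false positive rate can be large even when sensitivity and specificity are both high, because most positive responses then come from the much larger nonevent group.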

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.