
Multiple Regression

The **GPA** data set contains data collected to determine
which applicants at a large midwestern university were
likely to succeed in its computer science program.
The variable **GPA** is the measure of success of students in
the computer science program, and it is the response variable.
A *response variable* measures the outcome
to be explained or predicted.
Several other variables are also included in the study
as possible explanatory variables or predictors of **GPA**.
An *explanatory variable* may explain
variation in the response variable.
Explanatory variables for this example include average
high school grades in mathematics (**HSM**), English (**HSE**),
and science (**HSS**) (Moore and McCabe 1989).
To begin the regression analysis, follow these steps.

Open the **GPA** data set.

Choose **Analyze:Fit (Y X)**.

The fit variables dialog appears, as shown in
Figure 14.3. This dialog differs from all
other variables dialogs because it can remain visible
even after you create the fit window.
This makes it convenient to add and remove variables
from the model.
To make the variables dialog stay on the display,
click on the **Apply** button when you are finished
specifying the model.
Each time you modify the model and use the **Apply** button, a
new fit window appears so you can easily compare models.
Clicking on **OK** also displays a new fit
window but closes the dialog.

Select the variable **GPA** in the list on the left, then click the **Y** button.

**GPA** appears in the **Y** variables list.

Select the variables **HSM**, **HSS**, and **HSE**, then click the **X** button.

**HSM**, **HSS**, and **HSE** appear in the **X** variables list.

Click the **Apply** button.

A fit window appears, as shown in Figure 14.5.

This window shows the results of a regression
analysis of **GPA** on **HSM**, **HSS**, and **HSE**.
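The regression model for the *i*th
observation can be written as

$$\mathrm{GPA}_i = \beta_0 + \beta_1 \mathrm{HSM}_i + \beta_2 \mathrm{HSS}_i + \beta_3 \mathrm{HSE}_i + \varepsilon_i$$

where $\mathrm{GPA}_i$ is the value of **GPA**;
$\beta_0$ to $\beta_3$ are the regression coefficients (parameters);
$\mathrm{HSM}_i$, $\mathrm{HSS}_i$,
and $\mathrm{HSE}_i$ are the values of the
explanatory variables; and $\varepsilon_i$ is
the random error term. The $\varepsilon_i$'s
are assumed to be uncorrelated, with mean 0 and variance
$\sigma^2$.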
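
Outside the fit window, the same kind of model can be estimated by ordinary least squares. The following Python sketch is only an illustration, not the method SAS/INSIGHT software uses internally. The function name `fit_ols`, the synthetic data, the 2 to 10 grade scale, and the "true" coefficients are all assumptions made for the example; none of the values come from the GPA data set.

```python
import numpy as np

def fit_ols(X, y):
    """Return least-squares estimates [b0, b1, ..., bk] for
    y = b0 + b1*x1 + ... + bk*xk + error."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so that b0 plays the role of the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Synthetic stand-in data (assumption: NOT the real GPA data set).
rng = np.random.default_rng(0)
n = 200
hsm = rng.integers(2, 11, size=n)   # hypothetical math grades
hss = rng.integers(2, 11, size=n)   # hypothetical science grades
hse = rng.integers(2, 11, size=n)   # hypothetical English grades
true_beta = np.array([0.5, 0.2, 0.05, 0.03])  # made-up coefficients
errors = rng.normal(0.0, 0.5, size=n)         # uncorrelated, mean 0
gpa = (true_beta[0] + true_beta[1] * hsm + true_beta[2] * hss
       + true_beta[3] * hse + errors)

beta_hat = fit_ols(np.column_stack([hsm, hss, hse]), gpa)
print("estimated coefficients:", beta_hat)
```

The estimates are returned in the same order as the terms of the model: intercept first, then one coefficient per explanatory variable in the order the columns were supplied.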

By default, the fit window displays tables for
model information, **Model Equation**, **Summary of Fit**,
**Analysis of Variance**, **Type III Tests**, and
**Parameter Estimates**, and a residual-by-predicted plot,
as illustrated in Figure 14.5.
You can display other tables and graphs by clicking on
the **Output** button on the fit variables dialog or by
choosing menus as described in the section "Adding Tables and Graphs"
later in this chapter.
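
To make the kinds of quantities behind these tables and the residual-by-predicted plot concrete, here is a hedged Python sketch that computes them for a least-squares fit. The function name `summarize_fit` and the dictionary keys are hypothetical, chosen for this example; the sketch does not reproduce the exact layout or contents of the SAS/INSIGHT tables.

```python
import numpy as np

def summarize_fit(X, y):
    """Compute common fit summaries for a linear model with an intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)

    predicted = design @ coef
    residuals = y - predicted          # plotted against `predicted`

    # Sums of squares for the analysis of variance.
    ss_error = float(residuals @ residuals)
    ss_total = float(((y - y.mean()) ** 2).sum())
    ss_model = ss_total - ss_error
    df_model, df_error = k, n - k - 1

    ms_model = ss_model / df_model
    ms_error = ss_error / df_error
    return {
        "coefficients": coef,
        "predicted": predicted,
        "residuals": residuals,
        "r_square": ss_model / ss_total,
        "root_mse": ms_error ** 0.5,
        "f_stat": ms_model / ms_error,
        "df": (df_model, df_error),
    }

# Tiny made-up example with one explanatory variable.
result = summarize_fit([[0.0], [1.0], [2.0], [3.0]], [0.0, 1.0, 2.0, 4.0])
print(result["r_square"], result["f_stat"])
```

Plotting `result["residuals"]` against `result["predicted"]` gives the analogue of the residual-by-predicted plot; a pattern-free scatter around zero is consistent with the model assumptions.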


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.