 The ORTHOREG Procedure

Longley Data

The labor statistics data set of Longley (1967) is noted for being ill conditioned. Both the ORTHOREG and GLM procedures are applied for comparison (only portions of the PROC GLM results are shown). Note: The results from this example vary from machine to machine, depending on floating-point configuration. The following statements read the data into the SAS data set Longley.

```
title 'PROC ORTHOREG used with Longley data';
data Longley;
input Employment Prices GNP Jobless Military PopSize Year;
datalines;
60323  83.0 234289 2356 1590 107608 1947
61122  88.5 259426 2325 1456 108632 1948
60171  88.2 258054 3682 1616 109773 1949
61187  89.5 284599 3351 1650 110929 1950
63221  96.2 328975 2099 3099 112075 1951
63639  98.1 346999 1932 3594 113270 1952
64989  99.0 365385 1870 3547 115094 1953
63761 100.0 363112 3578 3350 116219 1954
66019 101.2 397469 2904 3048 117388 1955
67857 104.6 419180 2822 2857 118734 1956
68169 108.4 442769 2936 2798 120445 1957
66513 110.8 444546 4681 2637 121950 1958
68655 112.6 482704 3813 2552 123366 1959
69564 114.2 502601 3931 2514 125368 1960
69331 115.7 518173 4806 2572 127852 1961
70551 116.9 554894 4007 2827 130081 1962
;
run;
```

The data set contains one dependent variable, Employment (total derived employment), and six independent variables: Prices (GNP implicit price deflator with year 1954 = 100), GNP (gross national product), Jobless (unemployment), Military (size of armed forces), PopSize (non-institutional population aged 14 and over), and Year (year).
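One way to see the ill conditioning for yourself is to request collinearity diagnostics for the six regressors. The following statements are a minimal sketch and are not part of the original example. They assume that PROC REG is available and use its VIF and COLLIN options in the MODEL statement. Only the linear terms are included, because the PROC REG MODEL statement does not accept crossed effects such as Prices*Prices.

```
/* Sketch only: collinearity diagnostics for the six Longley regressors. */
/* VIF prints variance inflation factors; COLLIN prints eigenvalues and  */
/* condition indices of the scaled X'X matrix.                           */
proc reg data=Longley;
   model Employment = Prices GNP Jobless Military PopSize Year
         / vif collin;
run;
quit;
```

Large variance inflation factors and condition indices indicate near-linear dependence among the regressors.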

The following statements use the ORTHOREG procedure to model the Longley data using a quadratic model in each independent variable, without interaction:

```
proc orthoreg data=Longley;
   model Employment = Prices   Prices*Prices
                      GNP      GNP*GNP
                      Jobless  Jobless*Jobless
                      Military Military*Military
                      PopSize  PopSize*PopSize
                      Year     Year*Year;
run;
```
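Written out, the MODEL statement above corresponds to the following regression equation. The symbols are notation introduced here for illustration and do not appear in the original: the x values stand for the six independent variables (Prices, GNP, Jobless, Military, PopSize, Year), and i indexes the 16 observations.

```latex
\[
\text{Employment}_i
  = \beta_0
  + \sum_{j=1}^{6} \left( \beta_j x_{ij} + \gamma_j x_{ij}^{2} \right)
  + \varepsilon_i,
\qquad i = 1, \dots, 16
\]
```

The model has 12 regression terms plus an intercept, so with 16 observations only 16 - 13 = 3 degrees of freedom remain for error.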

Figure 48.1 shows the resulting analysis.

 PROC ORTHOREG used with Longley data

 ORTHOREG Regression Procedure
 Dependent Variable: Employment

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 12 | 184864508.5 | 15405375.709 | 320.24 | 0.0003 |
| Error | 3 | 144317.49568 | 48105.831895 | | |
| Corrected Total | 15 | 185008826 | | | |

| Root MSE | R-Square |
|---|---|
| 219.33 | 0.99922 |

| Parameter | DF | Parameter Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|---|
| Intercept | 1 | 186931078.640216 | 154201839.66 | 1.21 | 0.3122 |
| Prices | 1 | 1324.50679362506 | 916.17455832 | 1.45 | 0.2440 |
| Prices**2 | 1 | -6.61923922845539 | 4.7891445654 | -1.38 | 0.2609 |
| GNP | 1 | -0.12768642156232 | 0.0738897784 | -1.73 | 0.1824 |
| GNP**2 | 1 | 3.1369569286212E-8 | 8.7167753E-8 | 0.36 | 0.7428 |
| Jobless | 1 | -4.35507653558708 | 1.3851792402 | -3.14 | 0.0515 |
| Jobless**2 | 1 | 0.00022132944101 | 0.0001763541 | 1.26 | 0.2983 |
| Military | 1 | 4.91162014560828 | 1.826715856 | 2.69 | 0.0745 |
| Military**2 | 1 | -0.00113707146734 | 0.0003539971 | -3.21 | 0.0489 |
| PopSize | 1 | -0.0303997234299 | 5.9272538242 | -0.01 | 0.9962 |
| PopSize**2 | 1 | -1.212511414607E-6 | 0.0000237262 | -0.05 | 0.9625 |
| Year | 1 | -194907.139041839 | 157739.28757 | -1.24 | 0.3045 |
| Year**2 | 1 | 50.8067603538501 | 40.279878943 | 1.26 | 0.2963 |

Figure 48.1: PROC ORTHOREG Results
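The summary statistics in Figure 48.1 follow from the sums of squares and degrees of freedom reported there. As a quick check, using only the values printed in the figure and the standard definitions of these quantities:

```latex
\begin{align*}
\text{Mean Square (Error)} &= \frac{144317.49568}{3} \approx 48105.83 \\
\text{Root MSE} &= \sqrt{48105.83} \approx 219.33 \\
R^2 &= 1 - \frac{144317.49568}{185008826} \approx 0.99922 \\
F &= \frac{15405375.709}{48105.831895} \approx 320.24
\end{align*}
```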

The estimates in Figure 48.1 compare very well with the best estimates available; for additional information, refer to Longley (1967) and Beaton, Rubin, and Barone (1976).

The following statements request the same analysis from the GLM procedure:

```
ods select OverallANOVA
           FitStatistics
           ParameterEstimates;
proc glm data=Longley;
   model Employment = Prices   Prices*Prices
                      GNP      GNP*GNP
                      Jobless  Jobless*Jobless
                      Military Military*Military
                      PopSize  PopSize*PopSize
                      Year     Year*Year;
run;
```

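Figure 48.2 contains the overall ANOVA table and the parameter estimates produced by PROC GLM. Notice that the ORTHOREG fit achieves a somewhat smaller root mean square error (RMSE), 219.33 compared with 233.33, and also that the GLM procedure detects spurious singularities: it flags the Intercept, PopSize*PopSize, and Year estimates with the letter 'B' and fits the model with 11 degrees of freedom instead of 12.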

 PROC ORTHOREG used with Longley data

 The GLM Procedure
 Dependent Variable: Employment

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 11 | 184791061.6 | 16799187.4 | 308.58 | <.0001 |
| Error | 4 | 217764.4 | 54441.1 | | |
| Corrected Total | 15 | 185008826.0 | | | |

| R-Square | Coeff Var | Root MSE | Employment Mean |
|---|---|---|---|
| 0.998823 | 0.357221 | 233.3262 | 65317.00 |

| Parameter | Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|
| Intercept | -3598851.899 B | 1327335.652 | -2.71 | 0.0535 |
| Prices | 523.802 | 688.979 | 0.76 | 0.4894 |
| Prices*Prices | -2.326 | 3.507 | -0.66 | 0.5434 |
| GNP | -0.138 | 0.078 | -1.76 | 0.1526 |
| GNP*GNP | 0.000 | 0.000 | 0.24 | 0.8218 |
| Jobless | -4.599 | 1.459 | -3.15 | 0.0344 |
| Jobless*Jobless | 0.000 | 0.000 | 1.14 | 0.3183 |
| Military | 4.994 | 1.942 | 2.57 | 0.0619 |
| Military*Military | -0.001 | 0.000 | -3.15 | 0.0346 |
| PopSize | -4.246 | 5.156 | -0.82 | 0.4565 |
| PopSize*PopSize | 0.000 B | 0.000 | 0.81 | 0.4655 |
| Year | 0.000 B | . | . | . |
| Year*Year | 1.038 | 0.419 | 2.48 | 0.0683 |

 NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

Figure 48.2: Partial PROC GLM Results
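The MODEL statement in PROC GLM accepts a SINGULAR= option that tunes how readily a design column is declared linearly dependent on the columns before it. The following statements are a minimal sketch of how you might experiment with it. They are not part of the original example, and the value 1E-12 is an arbitrary illustrative choice. Whether it changes which terms are flagged depends on the machine, and relaxing the criterion does not by itself make the estimates more accurate.

```
/* Sketch only: rerun the GLM fit with a smaller singularity criterion. */
/* The value 1E-12 is an assumption chosen for illustration.            */
ods select OverallANOVA
           FitStatistics
           ParameterEstimates;
proc glm data=Longley;
   model Employment = Prices   Prices*Prices
                      GNP      GNP*GNP
                      Jobless  Jobless*Jobless
                      Military Military*Military
                      PopSize  PopSize*PopSize
                      Year     Year*Year
         / singular=1e-12;
run;
quit;
```

Comparing the resulting degrees of freedom and 'B' flags with Figure 48.2 shows whether the singularities reported there are sensitive to the criterion.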
