
The GLM Procedure

The following example illustrates the calculation. Suppose A has 20 levels, B has 4 levels, and C has 3 levels. Then consider the model

    proc glm;
       class A B C;
       model Y1 Y2 Y3=A B A*B C A*C B*C A*B*C X1 X2;
    run;

The **X'X** matrix (bordered by **X'Y**
and **Y'Y**) can have as many as 425 rows and columns:

- 1 for the intercept term
- 20 for A
- 4 for B
- 80 for A*B
- 3 for C
- 60 for A*C
- 12 for B*C
- 240 for A*B*C
- 2 for X1 and X2 (continuous variables)
- 3 for Y1, Y2, and Y3 (dependent variables)
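
The tally above can be checked with a short DATA step. This is only an illustrative sketch: the variable names `a`, `b`, `c`, and `ncols` are made up for the example, and the count assumes that every combination of levels occurs for each effect.

```sas
/* Illustrative only: count the rows and columns of the bordered X'X matrix */
data _null_;
   a = 20;  /* levels of A */
   b = 4;   /* levels of B */
   c = 3;   /* levels of C */
   ncols = 1          /* intercept              */
         + a + b + c  /* main effects           */
         + a*b + a*c + b*c  /* two-way interactions */
         + a*b*c      /* three-way interaction  */
         + 2          /* X1 and X2              */
         + 3;         /* Y1, Y2, and Y3         */
   put ncols=;        /* writes ncols=425 to the log */
run;
```

Changing `a`, `b`, or `c` shows how quickly the A*B*C term comes to dominate the total.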

The matrix has 425 rows and columns only if all combinations of levels occur for each effect in the model. For

The required memory grows as the square of the number of columns of

The second time that a large amount of memory is needed is when Type III, Type IV, or contrast sums of squares are being calculated. This memory requirement is a function of the number of degrees of freedom of the model being analyzed and the maximum degrees of freedom for any single source. Let Rank equal the sum of the model degrees of freedom, MaxDF be the maximum number of degrees of freedom for any single source, and

Unfortunately, these quantities are not available when the
**X'X** matrix is being constructed, so PROC GLM may
occasionally request additional memory even after you have
increased the memory allocation available to the program.

If you have a large model that exceeds the memory capacity of your computer, these are your options:

- Eliminate terms, especially high-level interactions.
- Reduce the number of levels for variables with many levels.
- Use the ABSORB statement for parts of the model that are large.
- Use the REPEATED statement for repeated measures variables.
- Use PROC ANOVA or PROC REG rather than PROC GLM, if your design allows.
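
As a rough sketch of the ABSORB option, the step below absorbs the 20-level factor A instead of listing it in the CLASS and MODEL statements. The data set name `work.trial` is hypothetical, and the model shown is a reduced one: it assumes you are willing to treat A as a main effect only and drop its interactions. The input data must be sorted by the absorbed variable.

```sas
/* Hypothetical data set name; sort by the variable to be absorbed */
proc sort data=work.trial;
   by A;
run;

/* A is absorbed, so it does not appear in the CLASS or MODEL statement */
proc glm data=work.trial;
   absorb A;
   class B C;
   model Y1 Y2 Y3=B C B*C X1 X2;
run;
```

Whether this reduced model is acceptable depends on your design; it is shown here only to illustrate the syntax.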

The time required for collecting sums and
cross products is difficult to calculate because
it is a complicated function of the model.
For a model with *m* columns and *n* rows (observations) in
**X**, the worst case occurs if all columns are continuous
variables, involving *nm*²/2 multiplications and additions.
If the columns are levels of a classification,
then only

Suppose you know that Type IV sums of squares are appropriate for the model you are analyzing (for example, if your design has no missing cells). You can specify the SS4 option in your MODEL statement, which saves CPU time by requesting the Type IV sums of squares instead of the more computationally burdensome Type III sums of squares. This proves especially useful if you have a factor in your model that has many levels and is involved in several interactions.
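
For instance, applied to the model from the earlier example, the request might look like the following sketch. It assumes Type IV sums of squares are in fact appropriate for your design.

```sas
proc glm;
   class A B C;
   /* SS4 requests Type IV sums of squares in place of the default Type III */
   model Y1 Y2 Y3=A B A*B C A*C B*C A*B*C X1 X2 / ss4;
run;
```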


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.