
The GLM Procedure

PROC GLM provides both univariate and multivariate tests for repeated measures for one response. For an overall reference on univariate repeated measures, refer to Winer (1971). The multivariate approach is covered in Cole and Grizzle (1966). For a discussion of the relative merits of the two approaches, see LaTour and Miniard (1983).

Another approach to analysis of repeated measures is via general mixed models. This approach can handle balanced as well as unbalanced or missing within-subject data, and it offers more options for modeling the within-subject covariance. The main drawback of the mixed models approach is that it generally requires iteration and, thus, may be less computationally efficient. For further details on this approach, see Chapter 41, "The MIXED Procedure," and Wolfinger and Chang (1995).

Consider the following data set, old:

    SUBJ  GROUP  TIME   Y
       1      1     1  15
       1      1     2  19
       1      1     3  25
       2      1     1  21
       2      1     2  18
       2      1     3  17
       1      2     1  14
       1      2     2  12
       1      2     3  16
       2      2     1  11
       2      2     2  20
       .
       .
       .
      10      3     1  14
      10      3     2  18
      10      3     3  16

There are three observations for each subject, corresponding to measurements taken at times 1, 2, and 3. These data could be analyzed using the following statements:

    proc glm data=old;
       class group subj time;
       model y=group subj(group) time group*time;
       test h=group e=subj(group);
    run;

However, this analysis assumes subjects' measurements are uncorrelated across time. A repeated measures analysis does not make this assumption. It uses a data set new:

    GROUP  Y1  Y2  Y3
        1  15  19  25
        1  21  18  17
        2  14  12  16
        2  11  20  21
        .
        .
        .
        3  14  18  16

In the data set new, the three measurements for a subject are all in one observation. For example, the measurements for subject 1 for times 1, 2, and 3 are 15, 19, and 25. For these data, the statements for a repeated measures analysis (assuming default options) are

    proc glm data=new;
       class group;
       model y1-y3=group / nouni;
       repeated time;
    run;

To convert the univariate form of repeated measures data to the multivariate form, you can use a program like the following:

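    proc sort data=old;
       by group subj;
    run;

    data new(keep=y1-y3 group);
       array yy(3) y1-y3;
       /* read the three rows for one subject into y1-y3 */
       do time=1 to 3;
          set old;
          by group subj;
          yy(time)=y;
          /* after the subject's last row, write one observation */
          if last.subj then return;
       end;
    run;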

Alternatively, you could use PROC TRANSPOSE to achieve the same results with a program like this one:

    proc sort data=old;
       by group subj;
    run;

    proc transpose out=new(rename=(_1=y1 _2=y2 _3=y3));
       by group subj;
       id time;
    run;

Refer to the discussions in *SAS Language Reference: Concepts*
for more information on rearrangement of data sets.
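
For readers who want to check the rearrangement outside SAS, the same long-to-wide pivot can be sketched in Python. This is an illustration only, not part of PROC GLM: the function name `to_wide` and the tuple layout are assumptions made for this sketch, and the rows are the first three subjects shown in the listing of data set old.

```python
from collections import defaultdict

# Rows of (subj, group, time, y), copied from the first three subjects of data set old.
old = [
    (1, 1, 1, 15), (1, 1, 2, 19), (1, 1, 3, 25),
    (2, 1, 1, 21), (2, 1, 2, 18), (2, 1, 3, 17),
    (1, 2, 1, 14), (1, 2, 2, 12), (1, 2, 3, 16),
]

def to_wide(rows, n_times=3):
    """Collect each subject's measurements into one record (group, y1, ..., yn)."""
    by_subject = defaultdict(lambda: [None] * n_times)
    for subj, group, time, y in rows:
        by_subject[(group, subj)][time - 1] = y  # time is 1-based
    # Sort by group, then subject, mirroring the BY group subj ordering.
    return [(group, *ys) for (group, subj), ys in sorted(by_subject.items())]

new = to_wide(old)
for record in new:
    print(record)
```

A subject with a missing time point would keep `None` in that slot, which makes gaps visible before the data are passed to an analysis.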

In repeated measures analysis of variance, the effects of interest are

- between-subject effects (such as GROUP in the previous example)
- within-subject effects (such as TIME in the previous example)
- interactions between the two types of effects (such as GROUP*TIME in the previous example)

Repeated measures analyses are distinguished from MANOVA because of interest in testing hypotheses about the within-subject effects and the within-subject-by-between-subject interactions.

For tests that involve only between-subjects effects, both the multivariate and univariate approaches give rise to the same tests. These tests are provided for all effects in the MODEL statement, as well as for any CONTRASTs specified. The ANOVA table for these tests is labeled "Tests of Hypotheses for Between Subjects Effects" on the PROC GLM results. These tests are constructed by first adding together the dependent variables in the model. Then an analysis of variance is performed on the sum divided by the square root of the number of dependent variables. For example, the statements

    model y1-y3=group;
    repeated time;

give a one-way analysis of variance using $(Y1 + Y2 + Y3)/\sqrt{3}$ as the dependent variable for testing hypotheses about the between-subject effect GROUP.
Tests for between-subject effects are equivalent to tests of the hypothesis
$\mathbf{L}\boldsymbol{\beta}\mathbf{M} = 0$, where **M** is simply a vector of 1s.
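
The transformed dependent variable used for the between-subject tests can be computed by hand as a check. The following Python sketch is an illustration outside SAS; the function name `between_subject_score` is an assumption, and the rows are the four subjects listed for data set new.

```python
import math

def between_subject_score(ys):
    """Sum the repeated measurements and divide by sqrt(number of measurements)."""
    return sum(ys) / math.sqrt(len(ys))

# (group, y1, y2, y3) rows taken from the listing of data set new.
new = [(1, 15, 19, 25), (1, 21, 18, 17), (2, 14, 12, 16), (2, 11, 20, 21)]

# One score per subject; a one-way ANOVA on these scores by group
# corresponds to the between-subject test described above.
scores = [(group, between_subject_score(ys)) for group, *ys in new]
for group, score in scores:
    print(group, round(score, 4))
```

Dividing by the square root of the number of dependent variables is the same as scaling the vector of 1s to unit length before applying it to the responses.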

For within-subject effects and for within-subject-by-between-subject interaction effects, the univariate and multivariate approaches yield different tests. These tests are provided for the within-subject effects and for the interactions between these effects and the other effects in the MODEL statement, as well as for any CONTRASTs specified. The univariate tests are displayed in a table labeled "Univariate Tests of Hypotheses for Within Subject Effects." Results for multivariate tests are displayed in a table labeled "Repeated Measures Analysis of Variance."

The multivariate tests provided for within-subjects effects and interactions involving these effects are Wilks' Lambda, Pillai's Trace, Hotelling-Lawley Trace, and Roy's maximum root. For further details on these four statistics, see the "Multivariate Tests" section in Chapter 3, "Introduction to Regression Procedures." As an example, the statements

    model y1-y3=group;
    repeated time;

produce multivariate tests for the within-subject effect TIME and the interaction TIME*GROUP.

The multivariate tests for within-subject effects are produced by testing
the hypothesis $\mathbf{L}\boldsymbol{\beta}\mathbf{M} = 0$, where the **L** matrix
is the usual matrix corresponding to Type I, Type II, Type III,
or Type IV hypothesis tests, and the **M** matrix is one of
several matrices depending on the transformation that you specify in the REPEATED statement.
The only assumption required for valid tests is
that the dependent variables in the model have a
multivariate normal distribution with a common
covariance matrix across the between-subject effects.

The univariate tests for within-subject effects and interactions
involving these effects require some assumptions for the
probabilities provided by the ordinary *F*-tests to be correct.
Specifically, these tests require certain patterns of covariance
matrices, known as Type H covariances (Huynh and Feldt 1970).
Data with these patterns in the covariance matrices
are said to satisfy the Huynh-Feldt condition.
You can test this assumption (and the Huynh-Feldt condition)
by applying a sphericity test (Anderson 1958) to any set of
variables defined by an orthogonal contrast transformation.
Such a set of variables is known
as a set of orthogonal components.
When you use the PRINTE option in the REPEATED statement, this
sphericity test is applied both to the transformed variables
defined by the REPEATED statement and to a set of orthogonal
components if the specified transformation is not orthogonal.
It is the test applied to the orthogonal
components that is important in determining
whether your data have Type H covariance structure.
When there are only two levels of the within-subject
effect, there is only one transformed variable, and a
sphericity test is not needed.
The sphericity test is labeled
"Test for Sphericity" on the output.

If your data satisfy the preceding assumptions, use the
usual *F*-tests to test univariate hypotheses for the
within-subject effects and associated interactions.
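
Sphericity tests of this kind are commonly based on Mauchly's criterion, which compares the determinant of the covariance matrix of the orthogonal components with a power of its average diagonal element. The following Python sketch computes only that criterion. It is an illustration under stated assumptions: the function names and the two covariance matrices are hypothetical, and whether the result matches PROC GLM's computation in every detail is not confirmed by this section.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def mauchly_w(cov):
    """W = det(S) / (trace(S) / k) ** k for a k x k covariance matrix S."""
    k = len(cov)
    trace = sum(cov[i][i] for i in range(k))
    return determinant(cov) / (trace / k) ** k

# Hypothetical covariance matrices of two orthogonal components.
spherical = [[2.0, 0.0], [0.0, 2.0]]   # equal variances, zero covariance
unequal = [[1.0, 0.0], [0.0, 3.0]]     # unequal variances

print(mauchly_w(spherical))
print(mauchly_w(unequal))
```

A value of 1 corresponds to exact sphericity of the components, and values closer to 0 indicate a stronger departure from it.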

If your data do not satisfy the assumption of
Type H covariance, an adjustment to numerator
and denominator degrees of freedom can be used.
Two such adjustments, based on a degrees
of freedom adjustment factor known as $\epsilon$ (epsilon) (Box 1954), are provided in PROC GLM.
Both adjustments estimate $\epsilon$ and then multiply the
numerator and denominator degrees of freedom by this estimate
before determining significance levels for the *F*-tests.
Significance levels associated with the adjusted tests
are labeled "Adj Pr > F" on the output.
The first adjustment, initially proposed for use in data
analysis by Greenhouse and Geisser (1959), is labeled
"Greenhouse-Geisser Epsilon" and represents the
maximum-likelihood estimate of Box's factor.
Significance levels associated with adjusted
*F*-tests are labeled "G-G" on the output.
Huynh and Feldt (1976) have shown that the G-G estimate
tends to be biased downward (that is, too conservative),
especially for small samples, and they have proposed an
alternative estimator that is constructed using unbiased
estimators of the numerator and denominator of Box's $\epsilon$.
Huynh and Feldt's estimator is labeled "Huynh-Feldt
Epsilon" on the PROC GLM output, and the significance levels
associated with adjusted *F*-tests are labeled "H-F."
Although $\epsilon$ must be in the range of 0 to
1, the H-F estimator can be outside this range.
When the H-F estimator is greater than 1, a value
of 1 is used in all calculations for probabilities,
and the H-F probabilities are not adjusted.
In summary, if your data do not meet
the assumptions, use adjusted *F*-tests.
However, when you strongly suspect that your data may not have
Type H covariance, all these
univariate tests should be interpreted cautiously.
In such cases, you should consider using the multivariate
tests instead.
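
The degrees of freedom adjustment can be illustrated with a small calculation. The following Python sketch uses the usual textbook form of the Greenhouse-Geisser estimate, computed from the covariance matrix of a set of orthonormal components. It is a sketch under stated assumptions rather than PROC GLM's code: the function names, the covariance matrix, and the unadjusted degrees of freedom are all hypothetical.

```python
def gg_epsilon(cov):
    """Greenhouse-Geisser estimate: trace(S)**2 / (k * trace(S*S)),
    where S is the k x k covariance matrix of orthonormal components."""
    k = len(cov)
    trace = sum(cov[i][i] for i in range(k))
    trace_of_square = sum(cov[i][j] * cov[j][i] for i in range(k) for j in range(k))
    return trace ** 2 / (k * trace_of_square)

def adjusted_df(epsilon, df_num, df_den):
    """Multiply both degrees of freedom by the estimate.
    An estimate above 1 (possible for H-F) is treated as 1."""
    eps = min(1.0, epsilon)
    return eps * df_num, eps * df_den

# Hypothetical covariance matrix of two orthonormal components.
cov = [[1.0, 0.0], [0.0, 3.0]]
eps = gg_epsilon(cov)                      # 16 / (2 * 10)
df_num, df_den = adjusted_df(eps, 2, 14)   # hypothetical unadjusted df of 2 and 14

print(eps, df_num, df_den)
```

When the component variances are equal and the covariances are zero, the estimate is 1 and the degrees of freedom are left unchanged.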

The univariate sums of squares for hypotheses involving
within-subject effects can be easily calculated from
the **H** and **E** matrices corresponding to
the multivariate tests described in the "Multivariate Analysis of Variance" section.
If the **M** matrix is orthogonal, the univariate sum of
squares is calculated as the trace (sum of diagonal elements)
of the appropriate **H** matrix; if it is not orthogonal,
PROC GLM calculates the trace of the **H** matrix that
results from an orthogonal **M** matrix transformation.
The appropriate error term for the univariate *F*-tests is
constructed in a similar way from the error SSCP matrix and is
labeled Error(*factorname*), where *factorname* indicates
the **M** matrix that is used in the transformation.
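
As a worked form of the calculation described above, suppose that **M** is orthonormal with $k$ columns and that the hypothesis and error have $\nu_h$ and $\nu_e$ degrees of freedom. The symbols $k$, $\nu_h$, and $\nu_e$ are introduced here for illustration and do not appear in the PROC GLM output. Under those assumptions, the univariate sums of squares and the *F* statistic can be written as:

```latex
\begin{aligned}
SS_{H} &= \operatorname{tr}(\mathbf{H}), \qquad SS_{E} = \operatorname{tr}(\mathbf{E}), \\
F &= \frac{\operatorname{tr}(\mathbf{H}) / (k\,\nu_h)}{\operatorname{tr}(\mathbf{E}) / (k\,\nu_e)}
\end{aligned}
```

Each of the $k$ transformed variables contributes $\nu_h$ hypothesis and $\nu_e$ error degrees of freedom, which is why both are multiplied by $k$.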

When the design specifies more than one repeated measures
factor, PROC GLM computes the **M** matrix for a given effect
as the direct (Kronecker) product of the **M** matrices
defined by the REPEATED statement if the factor is involved
in the effect or as a vector of 1s if the factor is not involved.
The test for the main effect of a repeated-measures factor
is constructed using an **L** matrix that corresponds
to a test that the mean of the observation is zero.
Thus, the main effect test for repeated measures is a test
that the means of the variables defined by the **M**
matrix are all equal to zero, while interactions involving
repeated-measures effects are tests that the between-subjects
factors involved in the interaction have no effect on the means
of the transformed variables defined by the **M** matrix.
In addition, you can specify other **L** matrices to
test hypotheses of interest by using the CONTRAST statement,
since hypotheses defined by CONTRAST statements are also
tested in the REPEATED analysis.
To see which combinations of the original variables
the transformed variables represent, you can specify
the PRINTM option in the REPEATED statement.
This option displays the transpose of **M**,
which is labeled as M in the PROC GLM results.
The tests produced are the same for any choice of transformation
(**M**) matrix specified in the REPEATED statement;
however, depending on the nature of the repeated measurements
being studied, a particular choice of transformation matrix,
coupled with the CANONICAL or SUMMARY option, can provide
additional insight into the data being studied.
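
The direct (Kronecker) product construction for designs with more than one repeated-measures factor can be sketched as follows. This Python example is an illustration under stated assumptions: the factors A (two levels) and B (three levels) are hypothetical, matrices are written with one row per transformed variable (the orientation displayed by PRINTM), and the six dependent variables are assumed to be ordered so that B changes fastest.

```python
def kron(a, b):
    """Direct (Kronecker) product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]

# Hypothetical transformations: successive differences for each factor.
m_a = [[1, -1]]                    # factor A, 2 levels
m_b = [[1, -1, 0], [0, 1, -1]]     # factor B, 3 levels
ones_a = [[1, 1]]                  # used when A is not involved in the effect
ones_b = [[1, 1, 1]]               # used when B is not involved in the effect

effect_a = kron(m_a, ones_b)       # main effect of A
effect_b = kron(ones_a, m_b)       # main effect of B
effect_ab = kron(m_a, m_b)         # A*B interaction

for name, matrix in [("A", effect_a), ("B", effect_b), ("A*B", effect_ab)]:
    print(name, matrix)
```

Each resulting row has six entries, one for each combination of the levels of A and B.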

The following sections describe the transformations
available in the REPEATED statement, provide an
example of the **M** matrix that is produced,
and give guidelines for the use of the transformation.
As in the PROC GLM output, the displayed matrix is labeled M.
This is the **M**' matrix.

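With the CONTRAST transformation, where the first of five drugs is a control or placebo, the statements

    proc glm;
       model d1-d5= / nouni;
       repeated drug 5 contrast(1) / summary printm;
    run;

produce the following **M** matrix:

$$
\mathbf{M} = \begin{bmatrix}
-1 & 1 & 0 & 0 & 0 \\
-1 & 0 & 1 & 0 & 0 \\
-1 & 0 & 0 & 1 & 0 \\
-1 & 0 & 0 & 0 & 1
\end{bmatrix}
$$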

When you examine the analysis of variance tables produced by the SUMMARY option, you can tell which of the drugs differed significantly from the placebo.
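
The rows of a contrast-with-reference-level matrix can be generated mechanically. The following Python sketch is an illustration outside SAS; the function name `contrast_rows` and its `ref` argument are assumptions modeled on the `contrast(1)` specification, not a SAS interface.

```python
def contrast_rows(n_levels, ref=1):
    """One row per non-reference level: that level minus the reference level.
    Levels are numbered from 1, as in the REPEATED statement."""
    rows = []
    for level in range(1, n_levels + 1):
        if level == ref:
            continue
        row = [0] * n_levels
        row[ref - 1] = -1
        row[level - 1] = 1
        rows.append(row)
    return rows

M = contrast_rows(5, ref=1)
for row in M:
    print(row)
```

Each row defines one transformed variable, so each univariate analysis requested by SUMMARY compares one drug with the reference level.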

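With the POLYNOMIAL transformation and five unequally spaced dose levels (1, 2, 5, 10, and 20), the statements

    proc glm;
       class group;
       model r1-r5=group / nouni;
       repeated dose 5 (1 2 5 10 20) polynomial / summary printm;
    run;

produce the following **M** matrix:

$$
\mathbf{M} = \begin{bmatrix}
-0.4250 & -0.3606 & -0.1674 & 0.1545 & 0.7984 \\
0.4349 & 0.2073 & -0.3252 & -0.7116 & 0.3946 \\
-0.4331 & 0.1366 & 0.7253 & -0.5108 & 0.0821 \\
0.4926 & -0.7800 & 0.3743 & -0.0936 & 0.0066
\end{bmatrix}
$$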

The SUMMARY option in this example provides
univariate ANOVAs for the variables defined
by the rows of this **M** matrix.
In this case, they represent the linear, quadratic,
cubic, and quartic trends for dose and are labeled
dose_1, dose_2, dose_3, and dose_4, respectively.
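
Orthonormal polynomial contrasts for unequally spaced levels can be obtained by orthogonalizing the powers of the level values. The following Python sketch uses Gram-Schmidt orthogonalization for this purpose. It is an illustration under stated assumptions: the function name `polynomial_rows` is hypothetical, and the method shown is one standard way to build such contrasts, not necessarily the algorithm that PROC GLM uses internally.

```python
import math

def polynomial_rows(levels):
    """Orthonormal polynomial contrasts of degree 1 .. n-1 for the given levels."""
    n = len(levels)
    center = sum(levels) / n
    x = [value - center for value in levels]  # centering improves numerical stability
    basis = []  # orthonormal vectors found so far, starting with the constant
    for degree in range(n):
        v = [xi ** degree for xi in x]
        for b in basis:  # remove the components along lower-degree vectors
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    return basis[1:]  # drop the constant vector

M = polynomial_rows([1, 2, 5, 10, 20])
for row in M:
    print(" ".join(f"{value:8.4f}" for value in row))
```

The first row is the centered dose values scaled to unit length, which is the linear trend; later rows are the quadratic, cubic, and quartic trends.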

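With the HELMERT transformation and four levels of a treatment factor, the statements

    proc glm;
       class sex;
       model resp1-resp4=sex / nouni;
       repeated trtmnt 4 helmert / canon printm;
    run;

produce the following **M** matrix:

$$
\mathbf{M} = \begin{bmatrix}
1 & -0.33333 & -0.33333 & -0.33333 \\
0 & 1 & -0.50000 & -0.50000 \\
0 & 0 & 1 & -1
\end{bmatrix}
$$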
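
Outside SAS, the rows of a Helmert-type matrix, in which each level is compared with the mean of the levels that follow it, can be generated as shown in the following Python sketch. The function name `helmert_rows` is an assumption made for this illustration.

```python
def helmert_rows(n_levels):
    """Row i compares level i with the mean of all subsequent levels."""
    rows = []
    for i in range(n_levels - 1):
        later = n_levels - i - 1            # number of subsequent levels
        row = [0.0] * i + [1.0] + [-1.0 / later] * later
        rows.append(row)
    return rows

M = helmert_rows(4)
for row in M:
    print(" ".join(f"{value:9.5f}" for value in row))
```

Because each row ignores the levels before it, the transformed variables help locate the level after which the response stops changing.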

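When the REPEATED statement specifies the MEAN transformation for the five drugs, as in

    repeated drug 5 mean / printm;

the following **M** matrix is produced:

$$
\mathbf{M} = \begin{bmatrix}
1 & -0.25 & -0.25 & -0.25 & -0.25 \\
-0.25 & 1 & -0.25 & -0.25 & -0.25 \\
-0.25 & -0.25 & 1 & -0.25 & -0.25 \\
-0.25 & -0.25 & -0.25 & 1 & -0.25
\end{bmatrix}
$$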

As with the CONTRAST transformation, if you want to omit a level other than the last, you can specify it in parentheses after the keyword MEAN in the REPEATED statement.
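
A matrix that compares each level with the mean of all other levels, with one level omitted, can be built as in the following Python sketch. It is an illustration outside SAS; the function name `mean_rows` and its `omit` argument are assumptions modeled on the optional level given in parentheses after the keyword MEAN.

```python
def mean_rows(n_levels, omit=None):
    """One row per retained level: that level minus the mean of all other levels.
    By default the last level is the one omitted; levels are numbered from 1."""
    omit = n_levels if omit is None else omit
    other_weight = -1.0 / (n_levels - 1)
    rows = []
    for level in range(1, n_levels + 1):
        if level == omit:
            continue
        row = [other_weight] * n_levels
        row[level - 1] = 1.0
        rows.append(row)
    return rows

M = mean_rows(5)            # omits level 5
M_alt = mean_rows(5, 1)     # omits level 1 instead
for row in M:
    print(row)
```

One level must be omitted because the full set of five comparisons is linearly dependent.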

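With the PROFILE transformation applied to four successive tests, the statements

    proc glm;
       class school;
       model t1-t4=school / nouni;
       repeated method 4 profile / summary nom printm;
    run;

produce the following **M** matrix:

$$
\mathbf{M} = \begin{bmatrix}
1 & -1 & 0 & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 1 & -1
\end{bmatrix}
$$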

To determine the point at which an improvement in test scores takes place, you can examine the analyses of variance for the transformed variables representing the differences between adjacent tests. These analyses are requested by the SUMMARY option in the REPEATED statement, and the variables are labeled METHOD.1, METHOD.2, and METHOD.3.
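
The differences between adjacent tests can be illustrated with a short Python sketch. It is an illustration outside SAS: the function names are assumptions, and the four scores are hypothetical values for a single student, not data from this chapter.

```python
def profile_rows(n_levels):
    """Row i is level i minus level i+1 (successive differences)."""
    rows = []
    for i in range(n_levels - 1):
        row = [0] * n_levels
        row[i] = 1
        row[i + 1] = -1
        rows.append(row)
    return rows

def transform(rows, values):
    """Apply each row to one subject's repeated measurements."""
    return [sum(w * v for w, v in zip(row, values)) for row in rows]

M = profile_rows(4)
scores = [60, 65, 80, 82]            # hypothetical scores on tests 1 to 4
differences = transform(M, scores)   # test1-test2, test2-test3, test3-test4

print(M)
print(differences)
```

In this hypothetical case the largest difference in magnitude is between the second and third tests, which is the kind of pattern the per-variable analyses are meant to reveal.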


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.