
The MIXED Procedure |

This section provides an overview of a likelihood-based approach to general linear mixed models. This approach simplifies and unifies many common statistical analyses, including those involving repeated measures, random effects, and random coefficients. The basic assumption is that the data are linearly related to unobserved multivariate normal random variables. Extensions to nonlinear and nonnormal situations are possible but are not discussed here. Additional theory with examples is provided in Littell et al. (1996) and Verbeke and Molenberghs (1997).

The preceding equations can be written simultaneously using vectors and a matrix, as follows:

**y** = **X**β + **Z**γ + ε

where **y** denotes the vector of observed data, β is an unknown vector of fixed-effects parameters with known design matrix **X**, γ is an unknown vector of random-effects parameters with known design matrix **Z**, and ε is an unobserved vector of random errors.
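As a numerical sketch outside of SAS, data following the model **y** = **X**β + **Z**γ + ε can be simulated directly; the dimensions, design matrices, and parameter values below are illustrative only:

```python
import numpy as np

# Simulate y = X*beta + Z*gamma + eps for a toy design (all values illustrative).
rng = np.random.default_rng(42)
n, q = 12, 3                                     # observations and random effects
X = np.column_stack([np.ones(n), np.arange(n)])  # fixed effects: intercept, trend
Z = np.kron(np.eye(q), np.ones((n // q, 1)))     # dummy variables for 3 groups
beta = np.array([1.0, 0.25])                     # fixed-effects parameters
G = 0.5 * np.eye(q)                              # Var(gamma)
R = 1.0 * np.eye(n)                              # Var(eps)

gamma = rng.multivariate_normal(np.zeros(q), G)  # random effects
eps = rng.multivariate_normal(np.zeros(n), R)    # residual errors
y = X @ beta + Z @ gamma + eps

V = Z @ G @ Z.T + R                              # marginal variance of y
```

Observations in the same group share a random effect, which is what makes the marginal variance **V** = **ZGZ**' + **R** nondiagonal.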

In addition to denoting data, random variables, and explanatory variables in the preceding fashion, the subsequent development makes use of basic matrix operators such as transpose ('), inverse (^{-1}), generalized inverse (^{-}), determinant (| |), and matrix multiplication. Refer to Searle (1982) for details on these and other matrix techniques.
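As a small numerical aside (not part of the SAS documentation), the generalized inverse can be illustrated with the Moore-Penrose pseudoinverse, which is one particular generalized inverse:

```python
import numpy as np

# A singular matrix has no ordinary inverse, but a generalized inverse A^-
# satisfying A A^- A = A always exists; pinv computes the Moore-Penrose one.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])      # rank 1, hence singular
A_g = np.linalg.pinv(A)
```

Any matrix satisfying A A^- A = A serves as a generalized inverse; the Moore-Penrose choice is merely a convenient, unique representative.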

A key assumption in the foregoing analysis is that γ and ε are normally distributed with

E[γ] = **0**,  E[ε] = **0**
Var[γ] = **G**,  Var[ε] = **R**,  Cov[γ, ε] = **0**

The variance of **y** is therefore **V** = **ZGZ**' + **R**, and you can model **V** by setting up the random-effects design matrix **Z** and by specifying covariance structures for **G** and **R**.

Note that this is a general specification of the mixed model, in contrast to many texts and articles that discuss only simple random effects. Simple random effects are a special case of the general specification with **Z** containing dummy variables, **G** containing variance components in a diagonal structure, and **R** = σ²**I**_{n}, where **I**_{n} denotes the *n* × *n* identity matrix. The general linear model is a further special case with **Z** = **0** and **R** = σ²**I**_{n}.

The following two examples illustrate the most common formulations of the general linear mixed model.

Suppose further that you want to introduce a common correlation
among the observations from a single individual, with correlation
being the same for all individuals. One way of setting this up in
the general mixed model is to eliminate the **Z** and **G** matrices
and let the **R** matrix be block diagonal with blocks corresponding
to the individuals and with each block having the *compound-symmetry*
structure. This structure has two unknown parameters, one modeling a common covariance and the other a residual variance. Each block of **R** then has the compound-symmetry form σ₁²**J** + σ²**I**, that is, σ₁² + σ² on the diagonal and the common covariance σ₁² off the diagonal, where **J** denotes a matrix of ones.

The PROC MIXED code to fit this model is as follows:

proc mixed;
   class indiv;
   model y = time;
   repeated / type=cs subject=indiv;
run;

Here, indiv is a classification variable indexing individuals. The
MODEL statement fits a straight line for time; the intercept is fit by
default just as in PROC GLM. The REPEATED statement models the **R**
matrix: TYPE=CS specifies the compound symmetry structure, and
SUBJECT=INDIV specifies the blocks of **R**.
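The block-diagonal structure that the REPEATED statement imposes on **R** can also be sketched numerically. This Python fragment is an illustration, not PROC MIXED internals; the variance values are arbitrary:

```python
import numpy as np

def cs_block(n_obs, sigma1_sq, sigma_sq):
    """One compound-symmetry block: common covariance sigma1_sq everywhere,
    plus residual variance sigma_sq added on the diagonal."""
    return sigma1_sq * np.ones((n_obs, n_obs)) + sigma_sq * np.eye(n_obs)

def block_diag_R(n_indiv, n_obs, sigma1_sq, sigma_sq):
    """R is block diagonal with one CS block per individual (SUBJECT=INDIV)."""
    R = np.zeros((n_indiv * n_obs, n_indiv * n_obs))
    for i in range(n_indiv):
        R[i*n_obs:(i+1)*n_obs, i*n_obs:(i+1)*n_obs] = cs_block(
            n_obs, sigma1_sq, sigma_sq)
    return R

# Two individuals, three observations each; variance values are arbitrary.
R = block_diag_R(n_indiv=2, n_obs=3, sigma1_sq=2.0, sigma_sq=1.0)
```

Observations from different individuals land in different blocks and so have zero covariance, while observations within an individual share the common covariance σ₁².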

An alternative way of specifying the common intra-individual correlation is to let **Z** contain dummy variables indicating the individuals, **G** = σ₁²**I**, and **R** = σ²**I**_{n}.

You can set up this model in PROC MIXED in two different but equivalent ways:

proc mixed;
   class indiv;
   model y = time;
   random indiv;
run;

proc mixed;
   class indiv;
   model y = time;
   random intercept / subject=indiv;
run;

Both of these specifications fit the same model as the previous one that used the REPEATED statement; however, the RANDOM specifications constrain the correlation to be positive whereas the REPEATED specification leaves the correlation unconstrained.

Such an arrangement is often necessitated by the nature of the experiment, the classical example being the application of fertilizer to large plots of land, with different crop varieties planted in subdivisions of the large plots. For this example, fertilizer is the whole-plot factor A and variety is the subplot factor B.

The first example is a split-plot design for which the whole plots are arranged in a randomized block design. The appropriate PROC MIXED code is as follows:

proc mixed;
   class a b block;
   model y = a|b;
   random block a*block;
run;

Here

random int a / subject=block;

fits the same model, but with **Z** and **G** sorted differently.

In many situations, the best approach is to use *likelihood-based* methods, exploiting the assumption that γ and ε are normally distributed (Hartley and Rao 1967; Patterson and Thompson 1971; Harville 1977; Laird and Ware 1982; Jennrich and Schluchter 1986). PROC MIXED implements two likelihood-based methods: *maximum likelihood* (ML) and *restricted/residual maximum likelihood* (REML). A favorable theoretical property of ML and REML is that they accommodate data that are missing at random (Rubin 1976; Little 1995).
PROC MIXED constructs an objective function associated with ML or REML and maximizes it over all unknown parameters. Using calculus, it is possible to reduce this maximization problem to one over only the parameters in **G** and **R**. The corresponding log-likelihood functions are as follows:

ML:    l(**G**, **R**) = -(1/2) log|**V**| - (1/2) **r**'**V**^{-1}**r** - (n/2) log(2π)

REML:  l_R(**G**, **R**) = -(1/2) log|**V**| - (1/2) log|**X**'**V**^{-1}**X**| - (1/2) **r**'**V**^{-1}**r** - ((n - p)/2) log(2π)

where **r** = **y** - **X**(**X**'**V**^{-1}**X**)^{-}**X**'**V**^{-1}**y**, **V** = **ZGZ**' + **R**, and *p* is the rank of **X**. PROC MIXED actually minimizes -2 times these functions by using a ridge-stabilized Newton-Raphson algorithm.
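These log likelihoods can be evaluated directly for a given **V**. The sketch below (plain numpy, not PROC MIXED internals) uses the ML form, the multivariate normal log likelihood of **y** with β profiled out by generalized least squares, and for REML adds the -(1/2) log|**X**'**V**^{-1}**X**| term and adjusts the 2π term by the rank of **X**; the example data are a trivial toy case:

```python
import numpy as np

def mixed_loglik(y, X, V, reml=False):
    """Evaluate the ML (or REML) mixed model log likelihood for a given V;
    a plain-numpy sketch, not PROC MIXED internals."""
    n = len(y)
    Vinv = np.linalg.inv(V)
    XtVX = X.T @ Vinv @ X
    beta = np.linalg.solve(XtVX, X.T @ Vinv @ y)  # GLS fixed-effects estimate
    r = y - X @ beta                              # the residual vector r
    ll = -0.5 * np.linalg.slogdet(V)[1] - 0.5 * (r @ Vinv @ r)
    if reml:
        p = np.linalg.matrix_rank(X)
        ll -= 0.5 * np.linalg.slogdet(XtVX)[1] + 0.5 * (n - p) * np.log(2 * np.pi)
    else:
        ll -= 0.5 * n * np.log(2 * np.pi)
    return ll

# Example: two observations, an intercept-only X, and V = I.
y = np.zeros(2)
X = np.ones((2, 1))
V = np.eye(2)
ml = mixed_loglik(y, X, V)              # equals -log(2*pi) here
rl = mixed_loglik(y, X, V, reml=True)   # equals -0.5*log(2) - 0.5*log(2*pi)
```

In practice the maximization is over the parameters in **G** and **R** through **V**; this fragment only shows the evaluation for one candidate **V**.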

One advantage of using the Newton-Raphson algorithm is that the second derivative matrix of the objective function evaluated at the optimum is available upon completion. Denoting this matrix **H**, the asymptotic theory of maximum likelihood (refer to Serfling 1980) shows that 2**H**^{-1} is an asymptotic variance-covariance matrix of the estimated parameters of **G** and **R**. Thus, tests and confidence intervals based on asymptotic normality can be obtained. However, these can be unreliable in small samples, especially for parameters such as variance components, which have sampling distributions that tend to be skewed to the right.
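The 2**H**^{-1} result can be checked numerically in the simplest possible case, *n* independent observations from N(0, v), where the asymptotic variance of the ML variance estimate is known in closed form to be 2v²/n. The data values below are arbitrary, and **H** here is the second derivative of the -2 log likelihood objective:

```python
import math

# Numerical check of 2*H^{-1} for n iid draws from N(0, v); the asymptotic
# variance of the ML estimate v_hat is known to be 2*v^2/n. Data arbitrary.
x = [1.2, -0.7, 2.1, 0.3, -1.5, 0.9, -0.4, 1.8]
n = len(x)
ss = sum(xi * xi for xi in x)

def neg2loglik(v):
    """The objective function: -2 log likelihood of N(0, v)."""
    return n * math.log(2 * math.pi * v) + ss / v

v_hat = ss / n   # ML estimate; the optimum of neg2loglik
h = 1e-4
# Finite-difference second derivative (the 1x1 matrix H) at the optimum
H = (neg2loglik(v_hat + h) - 2 * neg2loglik(v_hat) + neg2loglik(v_hat - h)) / h**2
asy_var = 2.0 / H             # 2 * H^{-1}
theory = 2.0 * v_hat**2 / n   # textbook asymptotic variance
```

The finite-difference Hessian agrees with the analytic value n/v̂², so 2**H**^{-1} reproduces the textbook asymptotic variance.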

If a residual variance σ² is a part of your mixed model, it can usually be *profiled* out of the likelihood. This means solving analytically for the optimal σ² and plugging this expression back into the likelihood formula (refer to Wolfinger, Tobias, and Sall 1994). This reduces the number of optimization parameters by one and can improve convergence properties. PROC MIXED profiles the residual variance out of the log likelihood whenever it appears reasonable to do so. This includes the case when **R** equals σ²**I**_{n} and when it has blocks with a compound symmetry, time series, or spatial structure. PROC MIXED does not profile the log likelihood when **R** has unstructured blocks, when you use the HOLD= or NOITER option in the PARMS statement, or when you use the NOPROFILE option in the PROC MIXED statement.
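Profiling can be sketched in the simplest setting, assuming **V** = σ²**V**₀ with **V**₀ known (taken here to be **I**, so the GLS fit reduces to ordinary least squares). Setting the derivative of the -2 log likelihood to zero gives the closed-form ML estimate σ̂² = **r**'**V**₀^{-1}**r**/n (REML divides by n - p instead); the data below are simulated for illustration:

```python
import numpy as np

# Sketch of profiling, assuming V = sigma2 * V0 with V0 = I known,
# so the GLS fit reduces to ordinary least squares; data are simulated.
rng = np.random.default_rng(0)
n, p = 20, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

beta = np.linalg.lstsq(X, y, rcond=None)[0]
rss = float((y - X @ beta) @ (y - X @ beta))

sigma2_ml = rss / n          # profiled ML estimate of the residual variance
sigma2_reml = rss / (n - p)  # profiled REML estimate

def neg2ll(s2):
    """-2 ML log likelihood as a function of sigma^2 alone."""
    return n * np.log(2 * np.pi * s2) + rss / s2

# sigma2_ml minimizes -2 log likelihood over a grid of candidate values:
grid = np.linspace(0.5 * sigma2_ml, 2.0 * sigma2_ml, 101)
```

Because σ² has a closed-form optimum, the numerical optimizer never has to search over it, which is exactly the convergence benefit described above.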

Instead of ML or REML, you can use the noniterative MIVQUE0 method
to estimate **G** and **R** (Rao 1972; LaMotte 1973; Wolfinger,
Tobias, and Sall 1994). In fact, by default PROC MIXED uses MIVQUE0
estimates as starting values for the ML and REML procedures. For
variance component models, another estimation method involves
equating Type I, II, or III expected mean squares to their observed
values and solving the resulting system. However, Swallow and
Monahan (1984) present simulation evidence favoring REML and ML over
MIVQUE0 and other method-of-moment estimators.

The solutions can also be written as

β̂ = (**X**'**V**^{-1}**X**)^{-} **X**'**V**^{-1}**y**
γ̂ = **G** **Z**'**V**^{-1}(**y** - **X**β̂)

Note that the mixed model equations are extended normal equations and that the preceding expression assumes that **G** is nonsingular. For the extreme case when the eigenvalues of **G** are very large, **G**^{-1} contributes very little to the equations and γ̂ is close to what it would be if γ actually contained fixed-effects parameters. On the other hand, when the eigenvalues of **G** are very small, **G**^{-1} dominates the equations and γ̂ is close to **0**. For intermediate cases, **G**^{-1} can be viewed as shrinking the fixed-effects estimates of γ towards **0** (Robinson 1991).
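This shrinkage behavior is easy to see numerically. The sketch below builds a toy one-way random-effects model and evaluates γ̂ = **G** **Z**'**V**^{-1}(**y** - **X**β̂) for very large and very small **G**; all dimensions and values are illustrative:

```python
import numpy as np

# Toy one-way random-effects model: watch gamma_hat shrink toward 0 as the
# eigenvalues of G shrink. All values illustrative.
rng = np.random.default_rng(1)
n_groups, per = 4, 5
Z = np.kron(np.eye(n_groups), np.ones((per, 1)))  # dummy variables for groups
X = np.ones((n_groups * per, 1))                  # intercept only
y = 2.0 + Z @ rng.normal(size=n_groups) + rng.normal(size=n_groups * per)

def gamma_hat(g_var, r_var=1.0):
    """gamma_hat = G Z' V^{-1} (y - X beta_hat), with G = g_var*I, R = r_var*I."""
    G = g_var * np.eye(n_groups)
    V = Z @ G @ Z.T + r_var * np.eye(len(y))
    Vinv = np.linalg.inv(V)
    beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    return G @ Z.T @ Vinv @ (y - X @ beta)

big = gamma_hat(1e6)     # large eigenvalues of G: little shrinkage
small = gamma_hat(1e-6)  # small eigenvalues of G: gamma_hat near 0
```

With large **G** the predictions approach what fixed group effects would give; with small **G** they collapse toward **0**, matching the discussion above.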

If **G** is singular, then the mixed model equations are modified (Henderson 1984) by replacing γ with **L**τ, where **L** is the lower-triangular Cholesky root of **G**, satisfying **G** = **LL**'. Both τ̂ and a generalized inverse of the left-hand-side coefficient matrix are then transformed using **L** to determine γ̂.

An example of when the singular form of the equations is necessary is when a variance component estimate falls on the boundary constraint of 0.

Space does not permit a thorough discussion of model selection, but a few brief comments and references are in order. First, subject matter considerations and objectives are of great importance when selecting a model; refer to Diggle (1988) and Lindsey (1993).

Second, when the data themselves are looked to for guidance, many of the graphical methods and diagnostics appropriate for the general linear model extend to the mixed model setting as well (Christensen, Pearson, and Johnson 1992).

Finally, a likelihood-based approach to the mixed model provides several statistical measures for model adequacy as well. The most common of these are the likelihood ratio test and Akaike's and Schwarz's criteria (Bozdogan 1987; Wolfinger 1993).
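As a sketch, these information criteria can be computed from a model's log likelihood and its number of (covariance) parameters. Note that sign conventions vary across software and across SAS releases; the smaller-is-better form is used below:

```python
import math

def aic(loglik, d):
    """Akaike's information criterion in the smaller-is-better form
    -2*loglik + 2*d, where d counts the parameters being penalized."""
    return -2.0 * loglik + 2.0 * d

def sbc(loglik, d, n):
    """Schwarz's Bayesian criterion: -2*loglik + d*log(n)."""
    return -2.0 * loglik + d * math.log(n)
```

Because the Schwarz penalty grows with log(n), it favors more parsimonious covariance structures than AIC once n exceeds about 8.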

However, **G** and **R** are usually unknown and are estimated using one of the aforementioned methods. These estimates, Ĝ and R̂, are therefore simply substituted into the preceding expression to obtain the estimated variance-covariance matrix Ĉ of (β̂, γ̂).

As a cautionary note, Ĉ tends to underestimate the true
sampling variability of (β̂, γ̂) because no account is made for the uncertainty in
estimating **G** and **R**. Although inflation factors have been
proposed (Kackar and Harville 1984; Kass and Steffey 1989; Prasad and
Rao 1990), they tend to be small for data sets that are fairly well
balanced. PROC MIXED does not compute any inflation factors by
default, but rather accounts for the downward bias by using the
approximate *t* and *F* statistics described subsequently. The
DDFM=KENWARDROGER option in the MODEL statement prompts PROC MIXED to
compute a specific inflation factor along with Satterthwaite-based
degrees of freedom.

A better alternative is the likelihood ratio χ² statistic. This statistic compares two covariance models, one a special case of the other. To compute it, you must run PROC MIXED twice, once for each of the two models, and then subtract the corresponding values of -2 times the log likelihoods. You can use either ML or REML to construct this statistic, which tests whether the full model is necessary beyond the reduced model.

As long as the reduced model does not occur on the boundary of the covariance parameter space, the statistic computed in this fashion has a large-sample sampling distribution that is χ² with degrees of freedom equal to the difference in the number of covariance parameters between the two models. If the reduced model does occur on the boundary of the covariance parameter space, the asymptotic distribution becomes a mixture of χ² distributions (Self and Liang 1987). A common example of this is when you are testing that a variance component equals its lower boundary constraint of 0.
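For the single-variance-component boundary case, the Self and Liang (1987) result gives a 50:50 mixture of χ²(0) and χ²(1), so the p-value is half the χ²(1) tail probability. This can be sketched with only the standard library, using the identity P(χ²₁ > x) = erfc(√(x/2)); the -2 log likelihood values below are hypothetical:

```python
import math

def chi2_sf_1df(x):
    """P(chi-square with 1 df > x) = erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))

def lrt_boundary_pvalue(neg2ll_reduced, neg2ll_full):
    """p-value for testing one variance component against its boundary 0,
    using the 50:50 mixture of chi-square(0) and chi-square(1) from
    Self and Liang (1987)."""
    stat = neg2ll_reduced - neg2ll_full
    return 0.5 * chi2_sf_1df(stat)

# Hypothetical -2 log likelihoods from two PROC MIXED runs:
p = lrt_boundary_pvalue(100.0, 96.2)   # statistic = 3.8
```

Ignoring the boundary and using a plain χ²(1) reference distribution would double this p-value, making the test conservative.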

A final possibility for obtaining inferences concerning the covariance parameters is to simulate or resample data from your model and construct empirical sampling distributions of the parameters. The SAS macro language and the ODS system are useful tools in this regard.

For inferences concerning the fixed- and random-effects parameters in the mixed model, consider estimable linear combinations of the following form:

**L** (β', γ')'

Statistical inferences are obtained by testing the hypothesis

H: **L** (β', γ')' = **0**

When **L** consists of a single row, a general *t*-statistic can be constructed as follows (refer to McLean and Sanders 1988; Stroup 1989a):

t = **L** (β̂', γ̂')' / (**L** Ĉ **L**')^{1/2}

When the rank of **L** is greater than 1, PROC MIXED constructs the following general *F*-statistic:

F = (β̂', γ̂') **L**' (**L** Ĉ **L**')^{-} **L** (β̂', γ̂')' / rank(**L**)
The *t*- and *F*-statistics enable you to make inferences about your
fixed effects, which account for the variance-covariance model you
select. An alternative is the χ² statistic associated with the likelihood ratio test. This statistic compares two
fixed-effects models, one a special case of the other. It is
computed just as when comparing different covariance models,
although you should use ML and not REML here because the penalty
term associated with restricted likelihoods depends upon the
fixed-effects specification.


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.