The CATMOD Procedure

Computational Formulas

The following calculations are shown for each population and then for all populations combined.

Source               Formula                                                   Dimension

Probability Estimates
  jth response       p_{ij} = n_{ij} / n_i                                     1 × 1
  ith population     p_i = [ p_{i1}, p_{i2}, \ldots, p_{ir} ]'                 r × 1
  all populations    p = [ p_1', p_2', \ldots, p_s' ]'                         sr × 1

Variance of Probability Estimates
  ith population     V_i = (1 / n_i) ( DIAG(p_i) - p_i p_i' )                  r × r
  all populations    V = DIAG(V_1, V_2, \ldots, V_s)                           sr × sr

Response Functions
  ith population     F_i = F(p_i)                                              q × 1
  all populations    F = [ F_1', F_2', \ldots, F_s' ]'                         sq × 1

Derivative of Function with Respect to Probability Estimates
  ith population     H_i = \partial F(p_i) / \partial p_i                      q × r
  all populations    H = DIAG(H_1, H_2, \ldots, H_s)                           sq × sr

Variance of Functions
  ith population     S_i = H_i V_i H_i'                                        q × q
  all populations    S = DIAG(S_1, S_2, \ldots, S_s)                           sq × sq

Inverse Variance of Functions
  ith population     S^i = (S_i)^{-1}                                          q × q
  all populations    S^{-1} = DIAG(S^1, S^2, \ldots, S^s)                      sq × sq
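
As a small numeric illustration of the probability and variance rows of this table, the following Python sketch computes p_i and V_i for one population from its cell counts. The function names and the counts are hypothetical and chosen only for illustration; this is not PROC CATMOD code.

```python
def proportions(counts):
    """p_ij = n_ij / n_i for one population, returned as a list of length r."""
    n = sum(counts)
    return [c / n for c in counts]


def variance(counts):
    """V_i = (1 / n_i) * (DIAG(p_i) - p_i p_i') as an r x r nested list."""
    n = sum(counts)
    p = proportions(counts)
    r = len(p)
    return [[((p[j] if j == k else 0.0) - p[j] * p[k]) / n for k in range(r)]
            for j in range(r)]


# Hypothetical counts for one population with r = 3 response levels.
counts = [20, 30, 50]
p = proportions(counts)
V = variance(counts)
```

Because the proportions sum to one, every row of V_i sums to zero, so V_i has rank at most r-1.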

Derivative Table for Compound Functions: Y=F(G(p))

In the following table, let G(p) be a vector of functions of p, and let D denote \partial G / \partial p, which is the first derivative matrix of G with respect to p.

Function            Y = F(G)         Derivative (\partial Y / \partial p)
Multiply matrix     Y = A * G        A * D
Logarithm           Y = LOG(G)       DIAG^{-1}(G) * D
Exponential         Y = EXP(G)       DIAG(Y) * D
Add constant        Y = G + A        D
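
The rules in this table can be chained one operation at a time. The Python sketch below applies them to Y = A * LOG(p), starting from G(p) = p with D equal to the identity matrix. The helper names (`matmul`, `log_step`, `exp_step`, `matrix_step`), the matrix A, and the probabilities are assumptions made for illustration, not part of the procedure.

```python
import math


def matmul(A, B):
    """Plain nested-list matrix product A * B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def log_step(G, D):
    """Y = LOG(G); derivative DIAG^{-1}(G) * D (row j of D divided by G_j)."""
    Y = [math.log(g) for g in G]
    return Y, [[d / g for d in row] for g, row in zip(G, D)]


def exp_step(G, D):
    """Y = EXP(G); derivative DIAG(Y) * D (row j of D multiplied by Y_j)."""
    Y = [math.exp(g) for g in G]
    return Y, [[y * d for d in row] for y, row in zip(Y, D)]


def matrix_step(A, G, D):
    """Y = A * G; derivative A * D."""
    Y = [sum(a * g for a, g in zip(row, G)) for row in A]
    return Y, matmul(A, D)


# Start from G(p) = p, whose derivative D is the identity matrix.
p = [0.2, 0.3, 0.5]
D0 = [[1.0 if j == k else 0.0 for k in range(3)] for j in range(3)]

# Y = A * LOG(p); this hypothetical A yields log(p_1/p_3) and log(p_2/p_3).
A = [[1.0, 0.0, -1.0], [0.0, 1.0, -1.0]]
G1, D1 = log_step(p, D0)
Y, dY = matrix_step(A, G1, D1)

# EXP undoes LOG, so the chained derivative returns to the identity matrix.
G2, D2 = exp_step(G1, D1)
```

Each helper returns both the function value and its derivative matrix, so a longer compound function is evaluated by passing the pair from one step to the next.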

Default Response Functions: Generalized Logits

In the following table, the population subscript i is suppressed. For each population i = 1, ..., s, let f_j = \log(p_j / p_r) for j = 1, ..., r-1.

Inverse of Response Functions for a Population
  p_j = \frac{\exp(f_j)}{1 + \sum_k \exp(f_k)}   for j = 1, ..., r-1
  p_r = \frac{1}{1 + \sum_k \exp(f_k)}

Form of F and Derivative for a Population
  F = K LOG(p) = ( I_{r-1}, -j ) LOG(p)
  H = \frac{\partial F}{\partial p} = ( DIAG_{r-1}^{-1}(p), -\frac{1}{p_r} j )

Covariance Results for a Population
  S = H V H'
    = \frac{1}{n} ( DIAG_{r-1}^{-1}(p) + ...
      ... q
  F' S^{-1} F = n \sum_j p_j f_j^2 - n ( \sum_j p_j f_j )^2
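
The generalized logit formulas can be checked numerically. The Python sketch below computes the logits for one population, inverts them back to probabilities, builds S = H V H' directly from H and V, and compares F' S^{-1} F with the closed form n \sum_j p_j f_j^2 - n (\sum_j p_j f_j)^2. It assumes that j is a vector of ones and uses r = 3 so that S is 2 × 2 and can be inverted explicitly; the function names and the data are hypothetical.

```python
import math


def logits(p):
    """f_j = log(p_j / p_r) for j = 1, ..., r-1."""
    return [math.log(pj / p[-1]) for pj in p[:-1]]


def inverse_logits(f):
    """Recover (p_1, ..., p_r) from the r-1 generalized logits."""
    denom = 1.0 + sum(math.exp(fk) for fk in f)
    return [math.exp(fj) / denom for fj in f] + [1.0 / denom]


def logit_covariance(p, n):
    """S = H V H' with V = (1/n)(DIAG(p) - p p')."""
    r = len(p)
    # H = (DIAG_{r-1}^{-1}(p), -(1/p_r) j), with j assumed to be a vector of ones.
    H = [[(1.0 / p[a] if a == c else 0.0) for c in range(r - 1)] + [-1.0 / p[-1]]
         for a in range(r - 1)]
    V = [[((p[a] if a == c else 0.0) - p[a] * p[c]) / n for c in range(r)]
         for a in range(r)]
    HV = [[sum(H[a][k] * V[k][c] for k in range(r)) for c in range(r)]
          for a in range(r - 1)]
    return [[sum(HV[a][k] * H[c][k] for k in range(r)) for c in range(r - 1)]
            for a in range(r - 1)]


# Hypothetical population: r = 3 response levels, n = 100 observations.
p, n = [0.2, 0.3, 0.5], 100
f = logits(p)
p_back = inverse_logits(f)

S = logit_covariance(p, n)
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
S_inv = [[S[1][1] / det, -S[0][1] / det], [-S[1][0] / det, S[0][0] / det]]

# Quadratic form F' S^{-1} F and the closed form; zip stops at j = r-1.
quad = sum(f[a] * S_inv[a][c] * f[c] for a in range(2) for c in range(2))
closed = (n * sum(pj * fj ** 2 for pj, fj in zip(p, f))
          - n * sum(pj * fj for pj, fj in zip(p, f)) ** 2)
```

The two quantities `quad` and `closed` agree up to floating-point error, and `p_back` reproduces `p`.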

The following calculations are shown for each population and then for all populations combined.
Source               Formula                                                   Dimension

Design Matrix
  ith population     X_i                                                       q × d
  all populations    X = [ X_1', X_2', \ldots, X_s' ]'                         sq × d

Crossproduct of Design Matrix
  ith population     C_i = X_i' S^i X_i                                        d × d
  all populations    C = X' S^{-1} X = \sum_i C_i                              d × d

Crossproduct of Design Matrix with Function
                     R = X' S^{-1} F = \sum_i X_i' S^i F_i                     d × 1

Weighted Least-Squares Estimates
                     b = C^{-1} R = (X' S^{-1} X)^{-1} (X' S^{-1} F)           d × 1

Covariance of Weighted Least-Squares Estimates
                     COV(b) = C^{-1}                                           d × d

Predicted Response Functions
                     \hat{F} = X b                                             sq × 1

Covariance of Predicted Response Functions
                     V_{\hat{F}} = X C^{-1} X'                                 sq × sq

Residual Chi-Square
                     RSS = F' S^{-1} F - \hat{F}' S^{-1} \hat{F}               1 × 1

Chi-Square for H_0: L \beta = 0
                     Q = (L b)' (L C^{-1} L')^{-1} (L b)                       1 × 1
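
The weighted least-squares quantities in this table can be traced on a small example. The Python sketch below assumes a binary response (r = 2, q = 1) with a logit response function, three populations, and a design matrix with an intercept and a linear trend (d = 2). For this case S_i = H_i V_i H_i' reduces to (1/n_i)(1/p_{i1} + 1/p_{i2}). The data, the design matrix, and the contrast L are hypothetical.

```python
import math

# Hypothetical data: s = 3 populations, binary response (r = 2, q = 1).
counts = [(30, 70), (45, 55), (60, 40)]
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]   # intercept and linear trend
s, d = len(X), 2

F, S_inv = [], []
for n1, n2 in counts:
    n = n1 + n2
    p1, p2 = n1 / n, n2 / n
    F.append(math.log(p1 / p2))               # F_i = log(p_i1 / p_i2)
    S_inv.append(n / (1.0 / p1 + 1.0 / p2))   # S^i = (H_i V_i H_i')^{-1}

# C = X' S^{-1} X and R = X' S^{-1} F, accumulated over populations.
C = [[sum(X[i][a] * S_inv[i] * X[i][c] for i in range(s)) for c in range(d)]
     for a in range(d)]
R = [sum(X[i][a] * S_inv[i] * F[i] for i in range(s)) for a in range(d)]

# b = C^{-1} R, using the explicit inverse of a 2 x 2 matrix.
det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
C_inv = [[C[1][1] / det, -C[0][1] / det], [-C[1][0] / det, C[0][0] / det]]
b = [sum(C_inv[a][c] * R[c] for c in range(d)) for a in range(d)]

# Predicted response functions and residual chi-square.
F_hat = [sum(X[i][a] * b[a] for a in range(d)) for i in range(s)]
rss = (sum(S_inv[i] * F[i] ** 2 for i in range(s))
       - sum(S_inv[i] * F_hat[i] ** 2 for i in range(s)))

# Q for H_0: L beta = 0 with L = (0, 1), a test of the trend parameter.
L = [0.0, 1.0]
Lb = sum(L[a] * b[a] for a in range(d))
LCL = sum(L[a] * C_inv[a][c] * L[c] for a in range(d) for c in range(d))
Q = Lb ** 2 / LCL
```

Because b satisfies the normal equations C b = R, the value of `rss` equals the weighted sum of squared residuals \sum_i S^i (F_i - \hat{F}_i)^2.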

Maximum Likelihood Method

Let C be the Hessian matrix and G be the gradient of the log-likelihood function (both are functions of \pi and the parameters \beta). Let p_i^* denote the vector containing the first r-1 sample proportions from population i, and let \pi_i^* denote the corresponding vector of probability estimates from the current iteration. Starting with the least-squares estimates b_0 of \beta (if you use the ML and WLS options; with the ML option alone, the procedure starts with 0), the probabilities \pi(b) are computed, and b is calculated iteratively by the Newton-Raphson method until it converges (see the EPSILON= option).

The factor \lambda is a step-halving factor that equals one at the start of each iteration. For any iteration in which the likelihood decreases, PROC CATMOD uses a series of subiterations in which \lambda is iteratively divided by two. The subiterations continue until the likelihood is greater than that of the previous iteration. If the likelihood has not reached that point after ten subiterations, then convergence is assumed, and a warning message is displayed.

Sometimes, infinite parameters may be present in the model, either because of the presence of one or more zero frequencies or because of a poorly specified model with collinearity among the estimates. If an estimate is tending toward infinity, then PROC CATMOD flags the parameter as infinite and holds the estimate fixed in subsequent iterations. PROC CATMOD regards a parameter as infinite when two conditions apply:

The estimator of the asymptotic covariance matrix of the maximum likelihood predicted probabilities is given by Imrey, Koch, and Stokes (1981, eq. 2.18).

The following equations summarize the method:

b_{k+1} = b_k - \lambda C^{-1} G
C = X' S^{-1}(\pi) X
N = [ n_1 (p_1^* - \pi_1^*)', \ldots, n_s (p_s^* - \pi_s^*)' ]'
G = X' N
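
The iteration with step halving can be sketched in Python for the simplest case, a binary response with a logit model and two parameters. This is an illustration under stated assumptions, not the procedure's implementation. The sketch takes C = X' S^{-1}(\pi) X, which is positive definite, and therefore adds \lambda C^{-1} G to b_k; this matches the update above when C is read as the Hessian of the log-likelihood, which is the negative of X' S^{-1}(\pi) X. That reading of the sign convention, the relative log-likelihood stopping rule, the function names, and the data are all assumptions.

```python
import math


def probs(X, b):
    """pi_i(b) for a binary logit model: 1 / (1 + exp(-x_i b))."""
    return [1.0 / (1.0 + math.exp(-sum(x * bj for x, bj in zip(row, b))))
            for row in X]


def loglik(counts, X, b):
    """Multinomial log-likelihood kernel for binary counts (n_i1, n_i2)."""
    return sum(n1 * math.log(q) + n2 * math.log(1.0 - q)
               for (n1, n2), q in zip(counts, probs(X, b)))


def score(counts, X, b):
    """G = X'N with N_i = n_i (p_i - pi_i)."""
    pi = probs(X, b)
    N = [(n1 + n2) * (n1 / (n1 + n2) - q) for (n1, n2), q in zip(counts, pi)]
    return [sum(row[a] * Ni for row, Ni in zip(X, N)) for a in range(len(b))]


def fit_ml(counts, X, eps=1e-8, max_iter=25):
    """Newton-Raphson with step halving; assumes exactly d = 2 parameters."""
    b = [0.0, 0.0]                      # ML option alone: start at 0
    ll = loglik(counts, X, b)
    for _ in range(max_iter):
        pi = probs(X, b)
        G = score(counts, X, b)
        # For a binary logit, S^{-1}(pi) is diagonal with n_i pi_i (1 - pi_i).
        w = [(n1 + n2) * q * (1.0 - q) for (n1, n2), q in zip(counts, pi)]
        C = [[sum(row[a] * wi * row[c] for row, wi in zip(X, w))
              for c in range(2)] for a in range(2)]
        det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
        step = [(C[1][1] * G[0] - C[0][1] * G[1]) / det,
                (C[0][0] * G[1] - C[1][0] * G[0]) / det]
        lam, halvings = 1.0, 0          # lambda equals one at the start
        new_b = [bj + lam * sj for bj, sj in zip(b, step)]
        new_ll = loglik(counts, X, new_b)
        while new_ll < ll and halvings < 10:
            lam /= 2.0                  # subiteration: halve the step
            halvings += 1
            new_b = [bj + lam * sj for bj, sj in zip(b, step)]
            new_ll = loglik(counts, X, new_b)
        if new_ll < ll:
            break                       # ten subiterations failed: stop
        done = abs(new_ll - ll) <= eps * max(1.0, abs(ll))  # assumed criterion
        b, ll = new_b, new_ll
        if done:
            break
    return b


# Hypothetical data: three populations, intercept and linear trend.
counts = [(30, 70), (45, 55), (60, 40)]
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b_hat = fit_ml(counts, X)
```

At the returned estimate the gradient `score(counts, X, b_hat)` is close to zero, which is the condition that the iteration is driving toward.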

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.