Computational Formulas
The following calculations are shown for each
population and then for all populations combined.
Source              Formula                                           Dimension

Probability Estimates
  jth response      p_{ij} = n_{ij} / n_{i}                           1 × 1
  ith population    p_{i} = (p_{i1}, p_{i2}, ... , p_{ir})'           r × 1
  all populations   p = (p_{1}', p_{2}', ... , p_{s}')'               sr × 1
Variance of Probability Estimates
  ith population    V_{i} = (1/n_{i}) (DIAG(p_{i}) - p_{i} p_{i}')    r × r
  all populations   V = DIAG(V_{1}, V_{2}, ... , V_{s})               sr × sr
Response Functions
  ith population    F_{i} = F(p_{i})                                  q × 1
  all populations   F = (F_{1}', F_{2}', ... , F_{s}')'               sq × 1
Derivative of Function with Respect to Probability Estimates
  ith population    H_{i} = \partial F(p_{i}) / \partial p_{i}        q × r
  all populations   H = DIAG(H_{1}, H_{2}, ... , H_{s})               sq × sr
Variance of Functions
  ith population    S_{i} = H_{i} V_{i} H_{i}'                        q × q
  all populations   S = DIAG(S_{1}, S_{2}, ... , S_{s})               sq × sq
Inverse Variance of Functions
  ith population    S^{i} = (S_{i})^{-1}                              q × q
  all populations   S^{-1} = DIAG(S^{1}, S^{2}, ... , S^{s})          sq × sq
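The first two blocks of the table can be illustrated for a single population with a minimal Python sketch. This is for illustration only and is not PROC CATMOD's implementation; the function name multinomial_estimates and the frequencies are hypothetical.

```python
def multinomial_estimates(counts):
    """Return (p, V) for one population from its response frequencies.

    p[j] = n_j / n  and  V = (1/n) * (DIAG(p) - p p').
    """
    n = sum(counts)
    p = [c / n for c in counts]
    r = len(p)
    # Diagonal entries are p_j (1 - p_j) / n, off-diagonal entries -p_j p_k / n.
    V = [[((p[j] if j == k else 0.0) - p[j] * p[k]) / n for k in range(r)]
         for j in range(r)]
    return p, V


counts = [30, 50, 20]  # hypothetical frequencies, r = 3 responses
p, V = multinomial_estimates(counts)
```

Each row of V sums to zero because the probabilities in a population sum to one, which is why the response functions are usually built from only r-1 of the r probabilities.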
Derivative Table for Compound Functions: Y=F(G(p))
In the following table, let G(p) be a
vector of functions of p, and let D
denote \partial G / \partial p, which is the first
derivative matrix of G with respect to p.
Function            Y = F(G)        Derivative

Multiply matrix     Y = A*G         A*D
Logarithm           Y = LOG(G)      DIAG^{-1}(G)*D
Exponential         Y = EXP(G)      DIAG(Y)*D
Add constant        Y = G + A       D
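As a worked illustration of chaining these rules (an example constructed here, with a hypothetical matrix A), take G(p) = p for r = 3, so that D is the identity matrix, and form Y = A*LOG(p). Applying the Logarithm rule first and the Multiply matrix rule second gives:

```latex
\[
A = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \end{pmatrix},
\qquad
\frac{\partial\,\mathrm{LOG}(p)}{\partial p} = \mathrm{DIAG}^{-1}(p)\, I,
\qquad
\frac{\partial Y}{\partial p} = A\,\mathrm{DIAG}^{-1}(p)
= \begin{pmatrix} 1/p_1 & 0 & -1/p_3 \\ 0 & 1/p_2 & -1/p_3 \end{pmatrix}.
\]
```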
Default Response Functions: Generalized Logits
In the following table, subscripts i for the population are suppressed.
Also denote f_{j} = log(p_{j} / p_{r})
for j = 1, ... , r-1
for each population i = 1, ... , s.
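Inverse of Response Functions for a Population
  p_{j} = exp(f_{j}) / (1 + \sum_{k} exp(f_{k}))   for j = 1, ... , r-1
  p_{r} = 1 / (1 + \sum_{k} exp(f_{k}))

Form of F and Derivative for a Population
  F = K LOG(p) = (I_{r-1}, -1_{r-1}) LOG(p)
  H = \partial F / \partial p = (DIAG_{r-1}^{-1}(p), -(1/p_{r}) 1_{r-1})

Covariance Results for a Population
  S^{-1} = n (DIAG_{r-1}(p) - q q'),   where q = (p_{1}, ... , p_{r-1})'
  S = (1/n) (DIAG_{r-1}^{-1}(p) + (1/p_{r}) J)
Here the sums run over k = 1, ... , r-1, 1_{r-1} is an (r-1) × 1 vector of ones,
J is an (r-1) × (r-1) matrix of ones, and DIAG_{r-1}(p) is the diagonal
matrix formed from the first r-1 elements of p.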
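The generalized logits f_{j} = log(p_{j} / p_{r}) defined above, and the mapping from the logits back to the probabilities, can be sketched in a few lines of Python. The function names are hypothetical and the probabilities are made up for the example.

```python
import math


def generalized_logits(p):
    """f_j = log(p_j / p_r) for j = 1..r-1; the last response is the reference."""
    return [math.log(pj / p[-1]) for pj in p[:-1]]


def inverse_generalized_logits(f):
    """Recover all r probabilities from the r-1 generalized logits."""
    denom = 1.0 + sum(math.exp(fj) for fj in f)
    return [math.exp(fj) / denom for fj in f] + [1.0 / denom]


p = [0.3, 0.5, 0.2]                    # hypothetical probabilities, r = 3
f = generalized_logits(p)              # two logits, relative to the third response
p_back = inverse_generalized_logits(f)
```

The round trip returns the original probabilities up to floating-point error, and the recovered probabilities always sum to one.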
The following calculations are shown for each
population and then for all populations combined.
Source              Formula                                           Dimension

Design Matrix
  ith population    X_{i}                                             q × d
  all populations   X = (X_{1}', X_{2}', ... , X_{s}')'               sq × d
Crossproduct of Design Matrix
  ith population    C_{i} = X_{i}' S^{i} X_{i}                        d × d
  all populations   C = X' S^{-1} X = \sum_{i} C_{i}                  d × d
Crossproduct of Design Matrix with Function
                    R = X' S^{-1} F = \sum_{i} X_{i}' S^{i} F_{i}     d × 1
Weighted Least-Squares Estimates
                    b = C^{-1} R = (X' S^{-1} X)^{-1} (X' S^{-1} F)   d × 1
Covariance of Weighted Least-Squares Estimates
                    COV(b) = C^{-1}                                   d × d
Predicted Response Functions
                    \hat{F} = X b                                     sq × 1
Covariance of Predicted Response Functions
                    V_{\hat{F}} = X C^{-1} X'                         sq × sq
Residual Chi-Square
                    RSS = F' S^{-1} F - \hat{F}' S^{-1} \hat{F}       1 × 1
Chi-Square for H_{0}: L \beta = 0
                    Q = (Lb)' (L C^{-1} L')^{-1} (Lb)                 1 × 1
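The weighted least-squares calculations can be sketched for the simplest case: a binary response (r = 2, so q = 1) with the logit as the response function and a design matrix holding an intercept and one covariate (d = 2). In that case S_{i} reduces to the scalar 1 / (n_{i} p_{i1} p_{i2}). The sketch below is plain Python for illustration, not PROC CATMOD's code; the function name wls_logit and the data are hypothetical.

```python
import math


def wls_logit(counts, xs):
    """Weighted least squares on logits.

    counts: one (n_i1, n_i2) pair per population; xs: one covariate value each.
    Returns b, COV(b) = C^{-1}, and the residual chi-square.
    """
    C = [[0.0, 0.0], [0.0, 0.0]]       # C = X' S^{-1} X
    R = [0.0, 0.0]                     # R = X' S^{-1} F
    F, W = [], []
    for (n1, n2), x in zip(counts, xs):
        n = n1 + n2
        p1, p2 = n1 / n, n2 / n
        F.append(math.log(p1 / p2))    # F_i, the observed logit
        W.append(n * p1 * p2)          # S^{i} = 1 / S_i for the logit
        row = (1.0, x)                 # X_i = (1, x_i)
        for a in range(2):
            R[a] += row[a] * W[-1] * F[-1]
            for c in range(2):
                C[a][c] += row[a] * W[-1] * row[c]
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    Cinv = [[C[1][1] / det, -C[0][1] / det],
            [-C[1][0] / det, C[0][0] / det]]
    b = [Cinv[0][0] * R[0] + Cinv[0][1] * R[1],
         Cinv[1][0] * R[0] + Cinv[1][1] * R[1]]
    # Residual chi-square written as (F - Xb)' S^{-1} (F - Xb).
    rss = sum(w * (f - (b[0] + b[1] * x)) ** 2 for w, f, x in zip(W, F, xs))
    return b, Cinv, rss


counts = [(20, 80), (35, 65), (50, 50)]   # hypothetical frequencies, s = 3
xs = [0.0, 1.0, 2.0]                      # hypothetical covariate values
b, cov_b, rss = wls_logit(counts, xs)
```

With three populations and two parameters, the residual chi-square has one degree of freedom. With only two populations the model is saturated and the residual chi-square is zero.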
Maximum Likelihood Method
Let C be the Hessian matrix and G be the
gradient of the log-likelihood function (both are functions
of the probabilities \pi and the parameters \beta).
Let p_{i}^{*} denote the vector containing the first
r-1 sample proportions from population i, and let
\pi_{i}^{*} denote the corresponding vector of
probability estimates from the current iteration.
Starting with the least-squares estimates b_{0} of
\beta (if you use the ML and WLS options; with the ML
option alone, the procedure starts with 0),
the probabilities \pi(b) are computed,
and b is calculated iteratively by the
Newton-Raphson method until it converges (see
the EPSILON= option).
The factor \lambda is a step-halving factor that
equals one at the start of each iteration.
For any iteration in which the likelihood
decreases, PROC CATMOD uses a series of subiterations
in which \lambda is iteratively divided by two.
The subiterations continue until the likelihood
is greater than that of the previous iteration.
If the likelihood has not reached that point
after ten subiterations, then convergence is
assumed, and a warning message is displayed.
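The iteration just described can be sketched for a single parameter as follows. This is an illustrative Python sketch, not PROC CATMOD's code. The function name, the convergence test (absolute change in the log-likelihood below a tolerance), and the choice to keep the previous estimate when ten subiterations fail are all assumptions made for the example.

```python
import math


def newton_step_halving(loglik, grad, hess, b0=0.0, epsilon=1e-8, max_iter=50):
    """One-parameter Newton-Raphson with step-halving (illustrative only)."""
    b, ll = b0, loglik(b0)
    for _ in range(max_iter):
        step = -grad(b) / hess(b)        # Newton step: -C^{-1} G
        factor = 1.0                     # step-halving factor starts at one
        b_new = b + step
        ll_new = loglik(b_new)
        halvings = 0
        while ll_new < ll and halvings < 10:
            factor /= 2.0                # likelihood decreased: halve the step
            halvings += 1
            b_new = b + factor * step
            ll_new = loglik(b_new)
        if ll_new < ll:
            # Still lower after ten subiterations: convergence is assumed.
            # Keeping the previous estimate here is an assumption of this sketch.
            return b, "assumed"
        converged = abs(ll_new - ll) < epsilon   # assumed convergence test
        b, ll = b_new, ll_new
        if converged:
            return b, "converged"
    return b, "max_iter"


# Hypothetical intercept-only logit model: y successes out of n trials.
y, n = 30, 100


def loglik(b):
    return y * b - n * math.log(1.0 + math.exp(b))


def grad(b):
    return y - n / (1.0 + math.exp(-b))


def hess(b):
    pi = 1.0 / (1.0 + math.exp(-b))
    return -n * pi * (1.0 - pi)


b_hat, status = newton_step_halving(loglik, grad, hess)
```

For this model the maximum likelihood estimate has the closed form log(y / (n - y)), which gives a direct check on the iteration.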
Sometimes, infinite parameters may be present in the model, either
because of the presence of one or more zero frequencies or because
of a poorly specified model with collinearity among the estimates.
If an estimate is tending toward infinity, then PROC CATMOD
flags the parameter as infinite and holds the estimate
fixed in subsequent iterations. PROC CATMOD regards a
parameter as infinite when both of the following conditions apply:
  - The absolute value of its estimate exceeds five
    divided by the range of the corresponding variable.
  - The standard error of its estimate is at least three
    times greater than the estimate itself.
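The two conditions can be written as a small predicate. This Python sketch is illustrative only: the function and argument names are hypothetical, and comparing the standard error with the absolute value of the estimate is an interpretation of the second condition.

```python
def looks_infinite(estimate, std_error, variable_range):
    """Return True when both conditions for an infinite parameter hold."""
    large_estimate = abs(estimate) > 5.0 / variable_range
    # Assumption: "the estimate itself" is read as its absolute value.
    large_std_error = std_error >= 3.0 * abs(estimate)
    return large_estimate and large_std_error


flag = looks_infinite(estimate=12.0, std_error=40.0, variable_range=1.0)
```

A large estimate with a small standard error does not meet the second condition, so it is not flagged.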
The estimator of the asymptotic covariance matrix of
the maximum likelihood predicted probabilities is
given by Imrey, Koch, and Stokes (1981, eq. 2.18).
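The following equation summarizes the method:
  b_{k+1} = b_{k} - \lambda C^{-1} G
where C and G are evaluated at the current estimate b_{k},
and \lambda is the step-halving factor.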
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.