The CATMOD Procedure

Example 20.5: Log-Linear Model, Structural and Sampling Zeros

This example illustrates a log-linear model of independence, using data that contain structural zero frequencies as well as sampling (random) zero frequencies.

In a population of six squirrel monkeys, the joint distribution of genital display with respect to active or passive role was observed. The data are from Fienberg (1980, Table 8-2). Since a monkey cannot have both the active and passive roles in the same interaction, the diagonal cells of the table are structural zeros. See Agresti (1990) for more information on the quasi-independence model.

Since there is only one population, the structural zeros are automatically deleted by PROC CATMOD. The sampling zeros are replaced in the DATA step by a positive number close to zero (1E-20). The row for Monkey T is deleted because it contains only zeros, so the cell frequencies that a model of independence predicts for that row are also zero. In addition, the two CONTRAST statements compare the behavior of the monkeys labeled U and V. The following statements produce Output 20.5.1 through Output 20.5.8:

   title 'Behavior of Squirrel Monkeys';
   data Display;
      input Active $ Passive $ wt @@;
      if Active ne 'T';
      if Active ne Passive then 
         if wt=0 then wt=1e-20;
      datalines;
   R R  0   R S  1   R T  5   R U  8   R V  9   R W  0
   S R 29   S S  0   S T 14   S U 46   S V  4   S W  0
   T R  0   T S  0   T T  0   T U  0   T V  0   T W  0
   U R  2   U S  3   U T  1   U U  0   U V 38   U W  2
   V R  0   V S  0   V T  0   V U  0   V V  0   V W  1
   W R  9   W S 25   W T  4   W U  6   W V 13   W W  0
   ;

   proc catmod data=Display;
      weight wt;
      model Active*Passive=_response_
            / freq pred=freq noparm noresponse oneway;
      loglin Active Passive;
      contrast 'Passive, U vs. V' Passive 0 0 0 1 -1;
      contrast 'Active,  U vs. V' Active  0 0 1 -1;
      title2 'Test Quasi-Independence for the Incomplete Table';
   quit;
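
In standard log-linear notation, the model requested by the LOGLIN statement can be sketched as follows. The symbols are assumptions introduced here for illustration and do not appear in the PROC CATMOD output: m_ij is the expected frequency for active monkey i and passive monkey j, and lambda^A and lambda^P are the Active and Passive main effects under sum-to-zero constraints.

```latex
\log m_{ij} = \mu + \lambda^{A}_{i} + \lambda^{P}_{j},
\qquad i \neq j,
\qquad \sum_{i} \lambda^{A}_{i} = \sum_{j} \lambda^{P}_{j} = 0
```

The diagonal cells (i = j) are structural zeros and are excluded from the model. With Monkey T removed from the Active variable, Active has five levels (four free parameters) and Passive has six levels (five free parameters), which is consistent with the nine parameter estimates listed in Output 20.5.5.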

Output 20.5.1: Log-Linear Model Analysis with Zero Frequencies

Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Response            Active*Passive    Response Levels     25
Weight Variable     wt                Populations          1
Data Set            DISPLAY           Total Frequency    220
Frequency Missing   0                 Observations        25


The results of the ONEWAY option are shown in Output 20.5.2. Monkey T does not show up as a value for the Active variable since that row was removed.

Output 20.5.2: Output from the ONEWAY option

Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

One-Way Frequencies
Variable Value Frequency
Active r 23
Active s 93
Active u 46
Active v 1
Active w 57
Passive r 40
Passive s 29
Passive t 24
Passive u 60
Passive v 64
Passive w 3

Output 20.5.3: Profiles

Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Sample Sample Size
1 220

Response Profiles
Response Active Passive
1 r s
2 r t
3 r u
4 r v
5 r w
6 s r
7 s t
8 s u
9 s v
10 s w
11 u r
12 u s
13 u t
14 u v
15 u w
16 v r
17 v s
18 v t
19 v u
20 v w
21 w r
22 w s
23 w t
24 w u
25 w v


Sampling zeros are displayed as 1E-20 in Output 20.5.4. The Response Number corresponds to the response number displayed in the Response Profiles table in Output 20.5.3.

Output 20.5.4: Frequency of Response by Response Number

Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Response Frequencies
Sample Response Number
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
1 1 5 8 9 1E-20 29 14 46 4 1E-20 2 3 1 38 2 1E-20 1E-20 1E-20 1E-20 1 9 25 4 6 13

Output 20.5.5: Iteration History

Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Maximum Likelihood Analysis
Iteration   Sub Iteration   -2 Log Likelihood   Convergence Criterion   Parameter Estimates (1 through 9)
0 0 1416.3054 1.0000 0 0 0 0 0 0 0 0 0
1 0 1238.2417 0.1257 -0.4976 1.1112 0.1722 -0.8804 -0.006978 0.0827 -0.4735 0.7287 0.5791
2 0 1205.1264 0.0267 -0.3420 1.0962 0.5612 -1.7549 0.2233 0.3899 -0.4086 0.7875 0.5728
3 0 1199.5068 0.004663 -0.1570 1.2687 0.7058 -2.3992 0.3034 0.4360 -0.3162 0.8812 0.6703
4 0 1198.6271 0.000733 -0.0466 1.3791 0.8170 -2.8422 0.3309 0.4625 -0.2890 0.9085 0.6968
5 0 1198.5611 0.0000551 -0.002748 1.4230 0.8609 -3.0176 0.3334 0.4649 -0.2866 0.9110 0.6992
6 0 1198.5603 6.5351E-7 0.002760 1.4285 0.8664 -3.0396 0.3334 0.4649 -0.2865 0.9110 0.6992
7 0 1198.5603 1.217E-10 0.002837 1.4285 0.8665 -3.0399 0.3334 0.4649 -0.2865 0.9110 0.6992

Maximum likelihood computations converged.

Output 20.5.6: Analysis of Variance Table

Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Maximum Likelihood Analysis of Variance
Source DF Chi-Square Pr > ChiSq
Active 4 56.58 <.0001
Passive 5 47.94 <.0001
Likelihood Ratio 15 135.17 <.0001


The analysis of variance table (Output 20.5.6) shows that the model of independence does not fit, since the likelihood ratio test for the interaction is significant. In other words, the active and passive roles of the squirrel monkeys are not independent.
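
As a hedged cross-check on the Likelihood Ratio line, the degrees of freedom and the statistic can be recovered from the other outputs, assuming the usual likelihood ratio (deviance) form. Here n_ij denotes the observed frequencies and m-hat_ij the predicted frequencies in the lower section of Output 20.5.8; this notation is an assumption and is not part of the PROC CATMOD output.

```latex
\mathrm{DF} = \underbrace{(25 - 1)}_{\text{response functions}} - \underbrace{(4 + 5)}_{\text{parameters}} = 15,
\qquad
G^{2} = 2 \sum_{i \neq j} n_{ij} \log \frac{n_{ij}}{\hat{m}_{ij}} \approx 135.17
```

Cells with a sampling zero (entered as 1E-20) contribute essentially nothing to the sum, so the statistic is driven by the cells with positive counts.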

Output 20.5.7: Contrasts between Monkeys U and V

Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Contrasts of Maximum Likelihood Estimates
Contrast DF Chi-Square Pr > ChiSq
Passive, U vs. V 1 1.31 0.2524
Active, U vs. V 1 14.87 0.0001


If the model fit these data, the contrasts in Output 20.5.7 would indicate that monkeys U and V have similar passive behavior patterns (p = 0.2524) but very different active behavior patterns (p = 0.0001).
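
The CONTRAST coefficients line up with the parameters of each effect in level order. As a sketch, assume that parameters 1 through 4 in Output 20.5.5 are the Active effects for R, S, U, and V, and that parameters 5 through 9 are the Passive effects for R, S, T, U, and V (the last level, W, is not a free parameter). The lambda notation is introduced here for illustration. Under that assumption, the two hypotheses and the estimated differences at the final iteration are:

```latex
\begin{aligned}
H_{0}^{P} &:\ \lambda^{P}_{U} - \lambda^{P}_{V} = 0,
  & \hat{\lambda}^{P}_{U} - \hat{\lambda}^{P}_{V} &= 0.9110 - 0.6992 = 0.2118 \\
H_{0}^{A} &:\ \lambda^{A}_{U} - \lambda^{A}_{V} = 0,
  & \hat{\lambda}^{A}_{U} - \hat{\lambda}^{A}_{V} &= 0.8665 - (-3.0399) = 3.9064
\end{aligned}
```

The small Passive difference and the large Active difference agree in direction with the chi-square values of 1.31 and 14.87 in Output 20.5.7.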

Output 20.5.8: Response Function and Frequency Predicted Values

Behavior of Squirrel Monkeys
Test Quasi-Independence for the Incomplete Table

The CATMOD Procedure

Maximum Likelihood Predicted Values for Response Functions and Frequencies
Sample   Active   Passive   Function Number   Observed Function   Observed Standard Error   Predicted Function   Predicted Standard Error   Residual
1     1 -2.5649494 1.03774904 -0.973554 0.33901898 -1.5913953
      2 -0.9555114 0.52623481 -1.7250404 0.34543788 0.76952896
      3 -0.4855078 0.44935852 -0.5275144 0.30925387 0.0420066
      4 -0.3677248 0.43362909 -0.7392682 0.24900568 0.37154345
      5 -48.616651 1E10 -3.560517 0.63410407 -45.056134
      6 0.80234647 0.33377513 0.32058886 0.2662902 0.48175761
      7 0.07410797 0.38516444 -0.2993416 0.29563358 0.37344956
      8 1.26369204 0.31410541 0.89818441 0.25085737 0.36550763
      9 -1.178655 0.57177187 0.6864306 0.17339604 -1.8650856
      10 -48.616651 1E10 -2.1348182 0.60807083 -46.481833
      11 -1.8718022 0.75955453 -0.2414953 0.28721789 -1.6303069
      12 -1.4663371 0.64051262 -0.1099394 0.30356781 -1.3563977
      13 -2.5649494 1.03774904 -0.8614257 0.31479379 -1.7035236
      14 1.0726368 0.32130806 0.12434644 0.20434511 0.94829036
      15 -1.8718022 0.75955453 -2.6969023 0.61743258 0.82510014
      16 -48.616651 1E10 -4.1478747 1.02450813 -44.468777
      17 -48.616651 1E10 -4.0163187 1.03006239 -44.600332
      18 -48.616651 1E10 -4.7678051 1.03245707 -43.848846
      19 -48.616651 1E10 -3.5702791 1.02079389 -45.046372
      20 -2.5649494 1.03774904 -6.6032817 1.16128927 4.03833233
      21 -0.3677248 0.43362909 -0.3658417 0.20295917 -0.001883
      22 0.65392647 0.34194017 -0.2342858 0.23279368 0.88821229
      23 -1.178655 0.57177187 -0.9857722 0.23940797 -0.1928828
      24 -0.7731899 0.49354812 0.21175381 0.18500696 -0.9849437
  r s F1 1 0.99772468 5.25950838 1.36156002 -4.2595084
  r t F2 5 2.21051208 2.48072585 0.6910659 2.51927415
  r u F3 8 2.77652497 8.21594841 1.85514611 -0.2159484
  r v F4 9 2.93799561 6.64804868 1.50931986 2.35195132
  r w F5 1E-20 1E-10 0.39576868 0.2402678 -0.3957687
  s r F6 29 5.01769596 19.1859928 3.14791495 9.81400723
  s t F7 14 3.62064786 10.321716 2.16959874 3.67828404
  s u F8 46 6.03173426 34.1846262 4.42870591 11.8153738
  s v F9 4 1.98173478 27.6609647 3.72278813 -23.660965
  s w F10 1E-20 1E-10 1.64670026 0.95271227 -1.6467003
  u r F11 2 1.40777064 10.936396 2.12321968 -8.936396
  u s F12 3 1.72020083 12.4740717 2.55433555 -9.4740717
  u t F13 1 0.99772468 5.8835826 1.3806555 -4.8835826
  u v F14 38 5.60681404 15.7672979 2.68469221 22.2327021
  u w F15 2 1.40777064 0.93865177 0.55164479 1.06134823
  v r F16 1E-20 1E-10 0.21996583 0.22177911 -0.2199658
  v s F17 1E-20 1E-10 0.2508934 0.25370612 -0.2508934
  v t F18 1E-20 1E-10 0.11833763 0.12031391 -0.1183376
  v u F19 1E-20 1E-10 0.39192393 0.39325479 -0.3919239
  v w F20 1 0.99772468 0.01887928 0.02172759 0.98112072
  w r F21 9 2.93799561 9.6576454 1.80865595 -0.6576454
  w s F22 25 4.70734436 11.0155266 2.27501884 13.9844734
  w t F23 4 1.98173478 5.19563797 1.18445235 -1.195638
  w u F24 6 2.41585671 17.2075014 2.77209793 -11.207501
  w v F25 13 3.49740163 13.9236886 2.24158038 -0.9236886


The lower section of the predicted-value table (Output 20.5.8) displays predicted cell frequencies (from the PRED=FREQ option), but since the model does not fit, these should be ignored.
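
The upper section of the same table lists 24 response functions. The following DATA step is a minimal sketch for checking a few of the observed values, assuming that the functions are generalized logits that use the last response (response 25, w v, with frequency 13) as the reference. The variable names are hypothetical.

```sas
   /* Sketch: recompute selected observed response functions by hand.   */
   /* Assumption: function k = log(frequency k / frequency 25).         */
   data _null_;
      ref = 13;                   /* frequency of response 25 (w v)     */
      f1  = log(1     / ref);     /* response 1 (r s), frequency 1      */
      f2  = log(5     / ref);     /* response 2 (r t), frequency 5      */
      f5  = log(1e-20 / ref);     /* response 5 (r w), sampling zero    */
      put f1= f2= f5=;            /* write the three values to the log  */
   run;
```

The three values should agree with the Observed Function entries for function numbers 1, 2, and 5 (-2.5649494, -0.9555114, and -48.616651). The last value shows why each sampling zero appears as a function near -48.6 with a standard error of 1E10.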

Structural and Sampling Zeros with Raw Data

The preceding PROC CATMOD step uses cell count data as input, so the structural and sampling zeros can be identified and adjusted in a single DATA step before the CATMOD procedure is invoked. When structural or sampling zeros (or both) may exist and the input data set contains raw data, use the following steps:
  1. Run PROC FREQ on the raw data. In the TABLES statement, list all dependent and independent variables separated by asterisks and use the SPARSE option and the OUT= option. This creates an output data set that contains all possible zero frequencies.
  2. Use a DATA step to change the zero frequencies associated with sampling zeros to a small value, such as 1E-20.
  3. Use the resulting data set as input to PROC CATMOD, and specify the WEIGHT COUNT statement so that the adjusted frequencies are used.
For example, suppose the data set RawDisplay contains the raw data for the squirrel monkey data. The following statements show how to obtain the same analysis as shown previously:

   proc freq data=RawDisplay;
      tables Active*Passive / sparse out=Combos noprint;
   run;

   data Combos2;
      set Combos;
      if Active ne 'T';
      if Active ne Passive then 
         if count=0 then count=1e-20;
   run;

   proc catmod data=Combos2;
      weight count;
      model Active*Passive=_response_
            / freq pred=freq noparm noresponse;
      loglin Active Passive;
   quit;

The first IF statement in the DATA step is needed only for this particular example; since observations for Monkey T were deleted from the Display data set, they also need to be deleted from Combos2.
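
The RawDisplay data set is not listed in this example. To try the preceding statements, one possible way to build it is to expand the cell counts into one observation per interaction, as in the following sketch. The variable names n and i are hypothetical, and the counts are copied from the Display DATA step.

```sas
   /* Sketch: expand cell counts into raw data, one row per interaction. */
   data RawDisplay;
      input Active $ Passive $ n @@;
      do i = 1 to n;              /* cells with n=0 produce no rows      */
         output;
      end;
      drop i n;
      datalines;
   R R  0   R S  1   R T  5   R U  8   R V  9   R W  0
   S R 29   S S  0   S T 14   S U 46   S V  4   S W  0
   T R  0   T S  0   T T  0   T U  0   T V  0   T W  0
   U R  2   U S  3   U T  1   U U  0   U V 38   U W  2
   V R  0   V S  0   V T  0   V U  0   V V  0   V W  1
   W R  9   W S 25   W T  4   W U  6   W V 13   W W  0
   ;
```

Because zero-count cells generate no observations, they are absent from RawDisplay, and the SPARSE option in PROC FREQ is what restores the zero-frequency combinations of the levels that do occur.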

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.