Example 28.4: Analyzing a 2x2 Contingency Table
This example computes chi-square tests and Fisher's exact
test to compare the probability of coronary heart disease
for two types of diet. It also estimates the relative risks
and computes exact confidence limits for the odds ratio.
The data set FatComp contains hypothetical data for a
case-control study of high fat diet and the risk of coronary
heart disease. The data are recorded as cell counts, where
the variable Count contains the frequencies for each
exposure and response combination. The data are sorted in
descending order by the variables Exposure and
Response, so that the first cell of the 2×2 table
contains the frequency of positive exposure and positive
response. The FORMAT procedure creates formats to
identify the type of exposure and response with character
values.
proc format;
   value ExpFmt 1='High Cholesterol Diet'
                0='Low Cholesterol Diet';
   value RspFmt 1='Yes'
                0='No';
run;

data FatComp;
   input Exposure Response Count;
   label Response='Heart Disease';
   datalines;
0 0 6
0 1 2
1 0 4
1 1 11
;

proc sort data=FatComp;
   by descending Exposure descending Response;
run;
In the following statements, the TABLES statement creates a
two-way table, and the option ORDER=DATA orders the
contingency table values by their order in the data set.
The CHISQ option produces several chi-square tests, while
the RELRISK option produces relative risk measures. The
EXACT statement creates the exact Pearson chi-square test
and exact confidence limits for the odds ratio. These
statements produce Output 28.4.1 through Output 28.4.3.
proc freq data=FatComp order=data;
   weight Count;
   tables Exposure*Response / chisq relrisk;
   exact pchi or;
   format Exposure ExpFmt. Response RspFmt.;
   title 'Case-Control Study of High Fat/Cholesterol Diet';
run;
Output 28.4.1: Contingency Table
Case-Control Study of High Fat/Cholesterol Diet

Table of Exposure by Response
Each cell lists, in order: Frequency, Percent, Row Pct, Col Pct

Exposure                 Response (Heart Disease)
                         Yes        No         Total
High Cholesterol Diet    11         4          15
                         47.83      17.39      65.22
                         73.33      26.67
                         84.62      40.00
Low Cholesterol Diet     2          6          8
                         8.70       26.09      34.78
                         25.00      75.00
                         15.38      60.00
Total                    13         10         23
                         56.52      43.48      100.00

The contingency table in Output 28.4.1 displays the variable
values so that the first table cell contains the frequency
for the first cell in the data set, the frequency of positive
exposure and positive response.
Output 28.4.2: Chi-Square Statistics
Case-Control Study of High Fat/Cholesterol Diet

Statistics for Table of Exposure by Response

Statistic                      DF    Value     Prob
Chi-Square                     1     4.9597    0.0259
Likelihood Ratio Chi-Square    1     5.0975    0.0240
Continuity Adj. Chi-Square     1     3.1879    0.0742
Mantel-Haenszel Chi-Square     1     4.7441    0.0294
Phi Coefficient                      0.4644
Contingency Coefficient              0.4212
Cramer's V                           0.4644

WARNING: 50% of the cells have expected counts less than 5.
(Asymptotic) Chi-Square may not be a valid test.

Pearson Chi-Square Test
Chi-Square                  4.9597
DF                          1
Asymptotic Pr > ChiSq       0.0259
Exact Pr >= ChiSq           0.0393

Fisher's Exact Test
Cell (1,1) Frequency (F)    11
Left-sided Pr <= F          0.9967
Right-sided Pr >= F         0.0367

Table Probability (P)       0.0334
Two-sided Pr <= P           0.0393

Because the expected counts in some of the cells are small,
PROC FREQ displays a warning that the asymptotic chi-square
tests may not be appropriate. In this case, the exact tests in
Output 28.4.2 are appropriate. The alternative hypothesis
for this analysis states that coronary heart disease is more
likely to be associated with a high fat diet, so a
one-sided test is desired. Fisher's exact right-sided
test analyzes whether the probability of heart disease in the
high fat group exceeds the probability of heart disease in
the low fat group; since this p-value is small (0.0367), the
alternative hypothesis is supported.
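
The warning and the right-sided p-value can both be checked by hand from the counts in Output 28.4.1. The worked computation below is a sketch, not a description of how PROC FREQ computes these values internally. The notation is introduced here for illustration: E_ij is the expected count in cell (i,j), n_i. and n_.j are the row and column totals, n is the overall total, and X is the count in cell (1,1). It assumes the usual independence formula for expected counts and, for Fisher's exact test, a hypergeometric distribution for X with both margins fixed.

```latex
% Expected counts under independence: E_ij = (row total)(column total)/n
\[
E_{ij} = \frac{n_{i\cdot}\, n_{\cdot j}}{n}
\]
\[
E_{11} = \frac{15 \times 13}{23} \approx 8.48, \quad
E_{12} = \frac{15 \times 10}{23} \approx 6.52, \quad
E_{21} = \frac{8 \times 13}{23} \approx 4.52, \quad
E_{22} = \frac{8 \times 10}{23} \approx 3.48
\]

% Fisher's exact right-sided p-value: hypergeometric tail for X >= 11
\[
\Pr(X \ge 11)
  = \sum_{k=11}^{13} \frac{\binom{15}{k}\binom{8}{13-k}}{\binom{23}{13}}
  = \frac{38220 + 3640 + 105}{1144066}
  \approx 0.0367
\]
```

Two of the four expected counts (4.52 and 3.48) fall below 5, which is the 50% cited in the warning. The first term of the sum, 38220/1144066, is approximately 0.0334 and corresponds to the Table Probability (P) reported in Output 28.4.2.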
Output 28.4.3: Relative Risk
Case-Control Study of High Fat/Cholesterol Diet

Statistics for Table of Exposure by Response

Estimates of the Relative Risk (Row1/Row2)
Type of Study                Value     95% Confidence Limits
Case-Control (Odds Ratio)    8.2500    1.1535     59.0029
Cohort (Col1 Risk)           2.9333    0.8502     10.1204
Cohort (Col2 Risk)           0.3556    0.1403      0.9009

Odds Ratio (Case-Control Study)
Odds Ratio                   8.2500

Asymptotic Conf Limits
95% Lower Conf Limit         1.1535
95% Upper Conf Limit         59.0029

Exact Conf Limits
95% Lower Conf Limit         0.8677
95% Upper Conf Limit         105.5488

The odds ratio, displayed in Output 28.4.3, provides an
estimate of the relative risk when an event is rare. This
estimate indicates that the odds of heart disease are 8.25
times as high in the high fat diet group as in the low fat
diet group; however, the wide confidence limits (exact 95%
limits of 0.8677 to 105.5488) indicate that this estimate
has low precision.
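
The point estimates in Output 28.4.3 can be reproduced from the four cell counts in Output 28.4.1. The following worked computation is a sketch. The symbols OR and RR are notation introduced here, and the interval shown assumes the standard large-sample (log odds ratio) formula with a normal quantile of 1.96. That assumption gives values that agree with the asymptotic limits reported in the output, but it is not taken from the text above.

```latex
% Odds ratio: cross-product ratio of the 2x2 table
\[
\mathrm{OR} = \frac{11 \times 6}{4 \times 2} = \frac{66}{8} = 8.25
\]

% Column 1 (Yes) and column 2 (No) relative risks, row 1 versus row 2
\[
\mathrm{RR}_{\text{Col1}} = \frac{11/15}{2/8} = \frac{44}{15} \approx 2.9333,
\qquad
\mathrm{RR}_{\text{Col2}} = \frac{4/15}{6/8} = \frac{16}{45} \approx 0.3556
\]

% Assumed large-sample 95% limits for the odds ratio on the log scale
\[
\exp\!\left( \ln 8.25 \pm 1.96 \sqrt{\tfrac{1}{11} + \tfrac{1}{4} + \tfrac{1}{2} + \tfrac{1}{6}} \right)
\approx (1.15,\ 59.0)
\]
```

The lower asymptotic limit is above 1, whereas the lower exact limit (0.8677) is below 1. With cell counts this small, the exact limits are the more cautious summary.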
