 The FREQ Procedure

## Example 28.2: Computing Chi-square Tests for One-Way Frequency Tables

This example examines whether the children's hair color (from Example 28.1) has a specified multinomial distribution for the two regions. The hypothesized distribution for hair color is 30% fair, 12% red, 30% medium, 25% dark, and 3% black.

To test the hypothesis for each region, the data are first sorted by Region. The FREQ procedure then uses a BY statement to produce a separate table for each BY group (Region). The WEIGHT statement uses the variable Count as the frequency of each observation. The option ORDER=DATA orders the frequency table values (hair color) by their order in the data set. The TABLES statement requests a frequency table for hair color, and the option NOCUM suppresses the display of the cumulative frequencies and percentages. The TESTP= option specifies the hypothesized percentages for the chi-square test; the number of percentages specified equals the number of table levels, and the percentages sum to 100. The following statements produce Output 28.2.1.

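```
/* Sort by Region so that PROC FREQ can use BY-group processing */
proc sort data=Color;
   by Region;
run;

/* One-way table of Hair for each Region, tested against
   the hypothesized percentages */
proc freq data=Color order=data;
   weight Count;
   tables Hair / nocum testp=(30 12 30 25 3);
   by Region;
   title 'Hair Color of European Children';
run;
```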

Output 28.2.1: One-way Frequency Table with BY Group

 Hair Color of European Children
 The FREQ Procedure
 Geographic Region=1
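Hair Color

| Hair | Frequency | Percent | Test Percent |
|--------|----:|------:|------:|
| fair | 76 | 30.89 | 30.00 |
| red | 19 | 7.72 | 12.00 |
| medium | 83 | 33.74 | 30.00 |
| dark | 65 | 26.42 | 25.00 |
| black | 3 | 1.22 | 3.00 |

Chi-Square Test for Specified Proportions

| Statistic | Value |
|------------|-------:|
| Chi-Square | 7.7602 |
| DF | 4 |
| Pr > ChiSq | 0.1008 |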

 Hair Color of European Children
 The FREQ Procedure
 Geographic Region=2
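Hair Color

| Hair | Frequency | Percent | Test Percent |
|--------|----:|------:|------:|
| fair | 152 | 29.46 | 30.00 |
| red | 94 | 18.22 | 12.00 |
| medium | 134 | 25.97 | 30.00 |
| dark | 117 | 22.67 | 25.00 |
| black | 19 | 3.68 | 3.00 |

Chi-Square Test for Specified Proportions

| Statistic | Value |
|------------|--------:|
| Chi-Square | 21.3824 |
| DF | 4 |
| Pr > ChiSq | 0.0003 |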

The frequency tables list the variable values (hair color) in the order in which they appear in the data set. The "Test Percent" column lists the hypothesized percentages for the chi-square test. Always check that you have ordered the TESTP= percentages to correctly match the order of the variable levels.
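One way to make that check before adding TESTP= is to run the same one-way table without the test and read off the order of the levels. The following is a minimal sketch that reuses only the data set and variables from the statements above. It assumes that Color has already been sorted by Region.

```
/* Sketch: list the Hair levels in data order, without a test */
proc freq data=Color order=data;
   weight Count;
   tables Hair / nocum;
   by Region;
run;
```

The TESTP= percentages can then be written in the same order as the rows of the resulting table.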

PROC FREQ computes a chi-square statistic for each region. The chi-square statistic is significant at the 0.05 level for Region 2 (p=0.0003) but not for Region 1 (p=0.1008). This indicates a significant departure from the hypothesized percentages in Region 2, while the Region 1 data are consistent with them.
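As a check, the Region 1 statistic can be reproduced by hand from the frequencies in Output 28.2.1. The sketch below assumes the usual Pearson goodness-of-fit form. The symbols are notation introduced here and do not appear in the output: $n_i$ is the observed frequency of level $i$, $p_i$ is the hypothesized proportion, and $n$ is the region total.

$$
\chi^2 = \sum_{i=1}^{5} \frac{(n_i - n p_i)^2}{n p_i}, \qquad n = 76 + 19 + 83 + 65 + 3 = 246
$$

$$
\chi^2 = \frac{(76 - 73.80)^2}{73.80} + \frac{(19 - 29.52)^2}{29.52} + \frac{(83 - 73.80)^2}{73.80} + \frac{(65 - 61.50)^2}{61.50} + \frac{(3 - 7.38)^2}{7.38}
$$

$$
\chi^2 \approx 0.0656 + 3.7490 + 1.1469 + 0.1992 + 2.5995 = 7.7602
$$

With five levels, the test has 5 - 1 = 4 degrees of freedom, which matches the DF row. The red category contributes nearly half of the total (observed 7.72% against a hypothesized 12%), but the sum stays below 9.488, the 0.05 critical value of the chi-square distribution with 4 degrees of freedom.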
