The SURVEYMEANS Procedure 
The example in the section "Stratified Sampling" assumes that the sample of students was selected using a stratified simple random sampling design. This example shows analysis based on a more complex sample design.
Suppose that every student belongs to a study group and that study groups are formed within each grade level. Each study group contains between 2 and 4 students. Table 61.3 shows the total number of study groups for each grade.
Table 61.3: Study Groups and Students by Grade

   Grade    Number of Study Groups    Number of Students
   7                           608                 1,824
   8                           252                 1,025
   9                           403                 1,151
   Total                     1,263                 4,000
It is quicker and more convenient to collect data from students in the same study group than to collect data from students individually. Therefore, this study uses a stratified clustered sample design. The primary sampling units, or clusters, are study groups. The list of all study groups in the school is stratified by grade level. From each grade level, a sample of study groups is randomly selected, and all students in each selected study group are interviewed. The sample consists of 8 study groups from the 7th grade, 3 groups from the 8th grade, and 5 groups from the 9th grade.
The SAS data set named IceCreamStudy contains the responses of the selected students.

   data IceCreamStudy;
      input Grade StudyGroup Spending @@;
      if (Spending < 10) then Group='less';
         else Group='more';
      datalines;
   7  34  7   7  34  7   7 412  4   9  27 14
   7  34  2   9 230 15   9  27 15   7 501  2
   9 230  8   9 230  7   7 501  3   8  59 20
   7 403  4   7 403 11   8  59 13   8  59 17
   8 143 12   8 143 16   8  59 18   9 235  9
   8 143 10   9 312  8   9 235  6   9 235 11
   9 312 10   7 321  6   8 156 19   8 156 14
   7 321  3   7 321 12   7 489  2   7 489  9
   7  78  1   7  78 10   7 489  2   7 156  1
   7  78  6   7 412  6   7 156  2   9 301  8
   ;
In the data set IceCreamStudy, the variable Grade contains a student's grade. The variable StudyGroup identifies a student's study group. Students from different grades can have the same study group number because study groups are numbered sequentially within each grade. The variable Spending contains a student's response to how much the student spends per week on ice cream, in dollars. The variable Group indicates whether a student spends at least $10 weekly on ice cream. It is not necessary to store the data in order of grade and study group.
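As a quick check on the cluster structure just described, the following sketch counts the distinct study groups and the responding students within each grade. It assumes that the IceCreamStudy data set has already been created by the preceding DATA step; the column names NumGroups and NumStudents are illustrative and are not part of the original example.

```sas
/* Sketch: summarize the sampled clusters by stratum.               */
/* Assumes IceCreamStudy exists; NumGroups and NumStudents are      */
/* hypothetical column names chosen for this illustration.          */
proc sql;
   select Grade,
          count(distinct StudyGroup) as NumGroups,   /* sampled study groups (PSUs) */
          count(*)                   as NumStudents  /* students interviewed        */
      from IceCreamStudy
      group by Grade
      order by Grade;
quit;
```

Because study group numbers repeat across grades, the count is taken within each grade rather than over the whole data set. The group counts should agree with the design description: 8 groups in grade 7, 3 in grade 8, and 5 in grade 9.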
The SAS data set StudyGroups is created to provide PROC SURVEYMEANS with the sample design information shown in Table 61.3.

   data StudyGroups;
      input Grade _total_;
      datalines;
   7 608
   8 252
   9 403
   ;
The variable Grade identifies the strata, and the variable _TOTAL_ contains the total number of study groups in each stratum. As discussed in the section "Specification of Population Totals and Sampling Rates", the population totals stored in the variable _TOTAL_ should be expressed in terms of the primary sampling units (PSUs), which are study groups in this example. Therefore, the variable _TOTAL_ contains the total number of study groups for each grade, rather than the total number of students.
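To make the PSU-level bookkeeping concrete, the following sketch computes, for each grade, the fraction of study groups that were sampled and the corresponding finite population correction factor, one minus that fraction. It is an illustration of the arithmetic only, not a required step in the analysis. The data set name StratumRates and the variables NumSampled, SamplingRate, and FPC are hypothetical; the population counts come from Table 61.3, and the sampled counts (8, 3, and 5) come from the design description.

```sas
/* Sketch: sampling rates expressed in PSUs (study groups).         */
/* StratumRates, NumSampled, SamplingRate, and FPC are hypothetical */
/* names used only for this illustration.                           */
data StratumRates;
   input Grade NumGroups NumSampled;
   SamplingRate = NumSampled / NumGroups;  /* sampled groups / all groups in grade */
   FPC          = 1 - SamplingRate;        /* finite population correction factor  */
   datalines;
7 608 8
8 252 3
9 403 5
;

proc print data=StratumRates noobs;
   format SamplingRate FPC 6.4;
run;
```

The sampling rates are small in every stratum, roughly 1.2% to 1.3%, so the correction factors are close to 1. Dividing by the number of students instead of the number of study groups would give different, incorrect rates for this design.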
The following SAS statements perform the analysis for this sample design.
   title1 'Analysis of Ice Cream Spending';
   title2 'Stratified Clustered Sample Design';
   proc surveymeans data=IceCreamStudy total=StudyGroups;
      stratum Grade / list;
      cluster StudyGroup;
      var Spending Group;
   run;

Output 61.1.1: Data Summary and Class Information


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.