 The SURVEYREG Procedure

## Simple Random Sampling

Suppose that, in a junior high school, there are a total of 4,000 students in grades 7, 8, and 9. You want to know how household income and the number of children in a household affect students' average weekly spending for ice cream.

In order to answer this question, you draw a sample using simple random sampling from the student population in the junior high school. You randomly select 40 students and ask them their average weekly expenditure for ice cream, their household income, and the number of children in their household. The answers from the 40 students are saved as a SAS data set.

```
data IceCream;
   /* The trailing @@ holds the input line so that several
      observations can be read from each data line */
   input Grade Spending Income Kids @@;
   datalines;
7   7  39  2   7   7  38  1   8  12  47  1
9  10  47  4   7   1  34  4   7  10  43  2
7   3  44  4   8  20  60  3   8  19  57  4
7   2  35  2   7   2  36  1   9  15  51  1
8  16  53  1   7   6  37  4   7   6  41  2
7   6  39  2   9  15  50  4   8  17  57  3
8  14  46  2   9   8  41  2   9   8  41  1
9   7  47  3   7   3  39  3   7  12  50  2
7   4  43  4   9  14  46  3   8  18  58  4
9   9  44  3   7   2  37  1   7   1  37  2
7   4  44  2   7  11  42  2   9   8  41  2
8  10  42  2   8  13  46  1   7   2  40  3
9   6  45  1   9  11  45  4   7   2  36  1
7   9  46  1
;
```

In the data set IceCream, the variable Grade indicates a student's grade. The variable Spending contains the dollar amount of each student's average weekly spending for ice cream. The variable Income specifies the household income, in thousands of dollars. The variable Kids indicates how many children are in a student's family.
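Before fitting a model, it can help to confirm that the data were read as intended. The following is a minimal sketch, not part of the original example, that uses PROC MEANS to summarize the three analysis variables. If the DATA step ran correctly, Spending should show 40 observations, a mean of 8.75, and a sum of 350, which agrees with the Data Summary in Figure 62.1.

```sas
/* Sanity check (illustrative addition): summarize the analysis variables */
proc means data=IceCream n mean sum min max;
   var Spending Income Kids;
run;
```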

The following PROC SURVEYREG statements request a regression analysis.

```
title1 'Ice Cream Spending Analysis';
title2 'Simple Random Sampling Design';
proc surveyreg data=IceCream total=4000;   /* 4,000 students in the population */
   class Kids;                             /* treat Kids as a classification variable */
   model Spending = Income Kids / solution;
run;
```

The PROC SURVEYREG statement invokes the procedure. The TOTAL=4000 option specifies the size of the population (4,000 students) from which the sample is drawn; the procedure uses this total to apply a finite population correction when it estimates variances. The CLASS statement requests that the procedure use the variable Kids as a classification variable in the analysis. The MODEL statement describes the linear model that you want to fit, with Spending as the dependent variable and Income and Kids as the independent variables. The SOLUTION option in the MODEL statement requests that the procedure display the regression coefficient estimates.
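As an alternative sketch, the same design information can be supplied as a sampling rate instead of a population total. With 40 students sampled from 4,000, the sampling rate is 40/4000 = 0.01, so the RATE= option below is assumed to be equivalent to TOTAL=4000 for this data set (all 40 observations are used in the analysis). This variant is an illustration and does not appear in the original example.

```sas
/* Illustrative variant: specify the sampling rate (40/4000 = 0.01)
   in place of the population total */
proc surveyreg data=IceCream rate=0.01;
   class Kids;
   model Spending = Income Kids / solution;
run;
```

Specify either TOTAL= or RATE=, not both. The population total is usually the more natural choice when it is known exactly, as it is here.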

Ice Cream Spending Analysis
Simple Random Sampling Design

The SURVEYREG Procedure
Regression Analysis for Dependent Variable Spending

| Data Summary | |
|---|---|
| Number of Observations | 40 |
| Mean of Spending | 8.75000 |
| Sum of Spending | 350.00000 |

| Fit Statistics | |
|---|---|
| R-square | 0.8132 |
| Root MSE | 2.4506 |
| Denominator DF | 39 |

Class Level Information

| Class Variable | Levels | Values |
|---|---|---|
| Kids | 4 | 1 2 3 4 |

Figure 62.1: Summary of Data

Figure 62.1 displays the summary of the data, the summary of the fit, and the levels of the classification variable Kids. The "Fit Statistics" table displays the denominator degrees of freedom, which are used in the F tests and t tests in the regression analysis.
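The value of 39 can be checked by hand. This sketch assumes the usual rule for a design without clusters: the denominator degrees of freedom equal the number of observations $n$ minus the number of strata $H$, with an unstratified sample counted as a single stratum.

$$
\text{Denominator DF} = n - H = 40 - 1 = 39
$$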

ANOVA for Dependent Variable Spending

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 4 | 915.310 | 228.8274 | 38.10 | <.0001 |
| Error | 35 | 210.190 | 6.0054 | | |
| Corrected Total | 39 | 1125.500 | | | |

Tests of Model Effects

| Effect | Num DF | F Value | Pr > F |
|---|---|---|---|
| Model | 4 | 119.15 | <.0001 |
| Intercept | 1 | 153.32 | <.0001 |
| Income | 1 | 324.45 | <.0001 |
| Kids | 3 | 0.92 | 0.4385 |

NOTE: The denominator degrees of freedom for the F tests is 39.

Figure 62.2: Testing Effects in the Regression

Figure 62.2 displays the ANOVA table for the regression and the tests for model effects. The effect Income is significant in the linear regression model (F = 324.45, p < .0001), while the effect Kids is not significant at the 5% level (F = 0.92, p = 0.4385).
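The fit statistics in Figure 62.1 can be recovered from the ANOVA entries in Figure 62.2. The following worked check assumes the standard definitions of R-square, mean square error, and the ANOVA F ratio.

$$
R^2 = \frac{SS_{\text{Model}}}{SS_{\text{Total}}} = \frac{915.310}{1125.500} \approx 0.8132
$$

$$
MSE = \frac{SS_{\text{Error}}}{DF_{\text{Error}}} = \frac{210.190}{35} \approx 6.0054,
\qquad
\text{Root MSE} = \sqrt{6.0054} \approx 2.4506
$$

$$
F_{\text{ANOVA}} = \frac{MS_{\text{Model}}}{MSE} = \frac{228.8274}{6.0054} \approx 38.10
$$

The ANOVA F value (38.10) differs from the Model F value (119.15) in the "Tests of Model Effects" table, because the two tables do not compute the test in the same way. Only the latter is reported with the design-based denominator degrees of freedom of 39.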

Estimated Regression Coefficients

| Parameter | Estimate | Standard Error | t Value | Pr > \|t\| |
|---|---|---|---|---|
| Intercept | -26.084677 | 2.46720403 | -10.57 | <.0001 |
| Income | 0.775330 | 0.04304415 | 18.01 | <.0001 |
| Kids 1 | 0.897655 | 1.12352876 | 0.80 | 0.4292 |
| Kids 2 | 1.494032 | 1.24705263 | 1.20 | 0.2381 |
| Kids 3 | -0.513181 | 1.33454891 | -0.38 | 0.7027 |
| Kids 4 | 0.000000 | 0.00000000 | . | . |

NOTE: The denominator degrees of freedom for the t tests is 39.

Matrix X'X is singular and a generalized inverse was used to solve the normal equations. Estimates are not unique.

Figure 62.3: Regression Coefficients

The regression coefficient estimates and their standard errors and associated t tests are displayed in Figure 62.3.
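To illustrate how the estimates in Figure 62.3 are read, the fitted model can be written with Kids = 4 as the reference level, whose coefficient is set to zero. The student in the example below (household income of 40 thousand dollars, two children) is hypothetical and chosen only for illustration.

$$
\widehat{\text{Spending}} = -26.084677 + 0.775330 \cdot \text{Income} + \delta_{\text{Kids}},
\qquad
\delta_1 = 0.897655,\ \delta_2 = 1.494032,\ \delta_3 = -0.513181,\ \delta_4 = 0
$$

$$
\widehat{\text{Spending}}\,(\text{Income} = 40,\ \text{Kids} = 2)
= -26.084677 + 0.775330 \cdot 40 + 1.494032 \approx 6.42
$$

That is, the model predicts average weekly ice cream spending of about 6.42 dollars for such a student. Each t value is the estimate divided by its standard error. For Income:

$$
t = \frac{0.775330}{0.04304415} \approx 18.01
$$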
