 The SURVEYMEANS Procedure

## Example 61.2: Unequal Weighting

Quite often in complex surveys, respondents have unequal weights, which reflect unequal selection probabilities and adjustments for nonresponse and poststratification. In such surveys, the appropriate sampling weights must be used to obtain valid estimates for the study population. This example illustrates analysis of survey data with unequal sampling weights.

Suppose that economists want to study profiles of the 800 top-performing companies to provide information on their impact on the economy. A sample of 66 companies is selected with unequal probability.

```
data Company;
   length Type $14;
   input Type$ Asset Sale Value Profit Employee Weight;
   datalines;
Other            2764.0  1828.0  1850.3   144.0   18.7   9.6
Energy          13246.2  4633.5  4387.7   462.9   24.3  42.6
Finance          3597.7   377.8    93.0    14.0    1.1  12.2
Transportation   6646.1  6414.2  2377.5   348.2   47.1  21.8
HiTech           1068.4  1689.8  1430.2    72.9    4.6   4.3
Manufacturing    1125.0  1719.4  1057.5    98.1   20.4   4.5
Other            1459.0  1241.4   452.7    24.5   20.1   5.5
Finance          2672.3   262.5   296.2    23.1    2.2   9.3
Finance           311.0   566.2   932.0    52.8    2.7   1.9
Energy           1148.6  1014.6   485.1    60.6    4.0   4.5
Finance          5327.0   572.4   372.9    25.2    4.2  17.7
Energy           1602.7   678.4   653.0    75.6    2.8   6.0
Energy           5808.8  1288.4  2007.0   318.8    5.9  19.2
Medical           268.8   204.4   820.9    45.6    3.7   1.8
Transportation   5222.6  2627.8  1910.0   245.6   22.8  17.4
Other             872.7  1419.4   939.3    69.7   12.2   3.7
Retail           4461.7  8946.8  4662.7   289.0  132.1  15.0
HiTech           6719.2  6942.0  8240.2   381.3   85.8  22.1
Retail            833.4  1538.8  1090.3    64.9   15.4   3.5
Finance           415.9   167.3  1126.8    56.8    0.7   2.2
HiTech            442.4  1139.9  1039.9    57.6   22.7   2.3
Other             801.5  1157.0   664.2    56.9   15.5   3.4
Finance          4954.8   468.8   366.4    41.7    3.0  16.5
Finance          2661.9   257.9   181.1    21.2    2.1   9.3
Finance          5345.8   530.1   337.4    36.4    4.3  17.8
Energy           3334.3  1644.7  1407.8   157.6    6.4  11.4
Manufacturing    1826.6  2671.7   483.2    71.3   25.3   6.7
Retail            618.8  2354.7   767.7    58.6   19.0   2.9
Retail           1529.1  6534.0   826.3    58.3   65.8   5.7
Manufacturing    4458.4  4824.5  3132.1    28.9   67.0  15.0
HiTech           5831.7  6611.1  9464.7   459.6   86.7  19.3
Medical          6468.3  4199.2  3170.4   270.1   59.5  21.3
Energy           1720.7   473.1   811.1    86.6    1.6   6.3
Energy           1679.7  1379.9   721.1    91.8    4.5   6.2
Retail           4018.2 16823.4  2038.3   178.1  162.0  13.6
Other             227.1   575.8  1083.8    62.6    1.9   1.6
Finance          3872.8   362.0   209.3    27.6    2.4  13.1
Retail           3359.3  4844.7  2651.4   224.1   75.6  11.5
Energy           1295.6   356.9   180.8   162.3    0.6   5.0
Energy           1658.0   626.6   688.0   126.0    3.5   6.1
Finance         12156.7  1345.5   680.7   106.6    9.4  39.2
HiTech           3982.6  4196.0  3946.8   313.9   64.3  13.5
Finance          8760.7   886.4  1006.9    90.0    7.5  28.5
Manufacturing    2362.2  3153.3  1080.0   137.0   25.2   8.4
Transportation   2499.9  3419.0   992.6    47.2   25.3   8.8
Energy           1430.4  1610.0   664.3    77.7    3.5   5.4
Energy          13666.5 15465.4  2736.7   411.4   26.6  43.9
Manufacturing    4069.3  4174.7  2907.6   289.2   38.2  13.7
Energy           2924.7   711.9  1067.8   146.7    3.4  10.1
Transportation   1262.1  1716.0   364.3    71.2   14.5   4.9
Medical           684.4   672.9   287.4    61.8    6.0   3.1
Energy           3069.3  1719.0  1439.0   196.4    4.9  10.6
Medical           246.5   318.8   924.1    43.8    3.1   1.7
Finance         11562.2  1128.5   580.4    64.2    6.7  37.3
Finance          9316.0  1059.4   816.5    95.9    8.0  30.2
Retail           1094.3  3848.0   563.3    29.4   44.7   4.4
Retail           1102.1  4878.3   932.4    65.2   47.3   4.4
HiTech            466.4   675.8   845.7    64.5    5.2   2.4
Manufacturing   10839.4  5468.7  1895.4   232.8   47.8  35.0
Manufacturing     733.5  2135.3    96.6    10.9    2.7   3.2
Manufacturing   10354.2 14477.4  5607.2   321.9  188.5  33.5
Energy           1902.1  2697.9   329.3    34.2    2.2   6.9
Other            2245.2  2132.2  2230.4   198.9    8.0   8.0
Transportation    949.4  1248.3   298.9    35.4   10.4   3.9
Retail           2834.4  2884.6   458.2    41.2   49.8   9.8
Retail           2621.1  6173.8  1992.7   183.7  115.1   9.2
;
```

The variable Type identifies the type of market for the company. The variable Asset contains the company's assets in millions of dollars. The variable Sale contains sales in millions of dollars. The variable Value contains the market value of the company in millions of dollars. The variable Profit contains the profit in millions of dollars. The variable Employee stores the number of employees in thousands, and the variable Weight contains the sampling weight. In this example, the sampling weights are reciprocals of the selection probabilities.
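Because each weight is the reciprocal of a selection probability, the implied probabilities can be recovered with a short DATA step. The following is only a sketch: the data set name `CompanyProb` and the variable name `SelectProb` are illustrative and do not appear in the original example.

```sas
/* Sketch: derive the implied selection probability for each company. */
/* CompanyProb and SelectProb are hypothetical names for illustration. */
data CompanyProb;
   set Company;
   SelectProb = 1 / Weight;   /* weight = 1 / selection probability */
run;

/* Show the first few companies with their weights and probabilities */
proc print data=CompanyProb(obs=5);
   var Type Asset Weight SelectProb;
run;
```

A company with a small weight had a high chance of selection and represents few companies in the population; a company with a large weight represents many.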

Using a probability sample design and the appropriate sampling weights, you can obtain statistically valid estimates for the study population. The following SAS statements compute estimates for this study.

```
title1 'Top Companies Profile Study';
title2 'Using Sampling Weights';
proc surveymeans data=Company total=800 mean sum;
   var Asset Sale Value Profit Employee;
   weight Weight;
run;
```

The TOTAL=800 option specifies the total number of companies in the study population. The statistic-keywords MEAN and SUM request estimates of the mean and total for the analysis variables. The WEIGHT statement identifies the sampling weight variable Weight. The VAR statement lists the variables to analyze.
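As a rough guide to what MEAN and SUM produce as point estimates under weighting, the estimators can be written as follows. The notation is introduced here for illustration and is not part of the original example: $y_i$ is the value of an analysis variable for sampled company $i$, $w_i$ is its sampling weight, and $n = 66$ is the sample size. The standard errors depend on the sample design and are not covered by this sketch.

$$
\hat{Y} = \sum_{i=1}^{n} w_i \, y_i ,
\qquad
\hat{\bar{Y}} = \frac{\sum_{i=1}^{n} w_i \, y_i}{\sum_{i=1}^{n} w_i}
$$

Here $\hat{Y}$ is the estimated population total and $\hat{\bar{Y}}$ is the estimated population mean.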

Output 61.2.1: Company Profile Study

 Top Companies Profile Study Using Sampling Weights

 The SURVEYMEANS Procedure

 Data Summary

| | |
|---|---|
| Number of Observations | 66 |
| Sum of Weights | 799.8 |

 Statistics

| Variable | Mean | Std Error of Mean | Sum | Std Dev |
|---|---|---|---|---|
| Asset | 6523.488510 | 720.557075 | 5217486 | 1073829 |
| Sale | 4215.995799 | 839.132506 | 3371953 | 847885 |
| Value | 2145.935121 | 342.531720 | 1716319 | 359609 |
| Profit | 188.788210 | 25.057876 | 150993 | 30144 |
| Employee | 36.874869 | 7.787857 | 29493 | 7148.003298 |

Output 61.2.1 shows that there are 66 observations in the sample. The sum of the sampling weights equals 799.8, which is close to the total number of companies in the study population.
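The observation count and the sum of weights can also be checked independently of PROC SURVEYMEANS. The following sketch uses PROC MEANS on the Weight variable; it is a sanity check, not part of the original analysis.

```sas
/* Sketch: confirm the number of observations and the sum of weights */
proc means data=Company n sum maxdec=1;
   var Weight;
run;
```

The N and Sum columns should agree with the 66 observations and the sum of weights of 799.8 reported in the "Data Summary" table.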

The "Statistics" table in Output 61.2.1 displays the estimates of the mean and total for all analysis variables.

If you do not use the appropriate sampling weights, then the results of the analysis may be biased. For example, the following statements analyze the data without the sampling weights that reflect the unequal probabilities of selection.

```
title1 'Top Companies Profile Study';
title2 'Without Using the Sampling Weights';
proc surveymeans data=Company total=800 mean sum;
   var Asset Sale Value Profit Employee;
run;
```

Output 61.2.2: Company Profile Study without Sampling Weights

 Top Companies Profile Study Without Using the Sampling Weights

 The SURVEYMEANS Procedure

 Data Summary

| | |
|---|---|
| Number of Observations | 66 |

 Statistics

| Variable | Mean | Std Error of Mean | Sum | Std Dev |
|---|---|---|---|---|
| Asset | 3557.753030 | 401.508963 | 234812 | 26500 |
| Sale | 2881.306061 | 407.864339 | 190166 | 26919 |
| Value | 1517.507576 | 206.197430 | 100156 | 13609 |
| Profit | 129.121212 | 13.824279 | 8522.000000 | 912.402420 |
| Employee | 27.704545 | 4.601392 | 1828.500000 | 303.691848 |

Output 61.2.2 shows the results of the analysis without using the sampling weights. These results only summarize the sample of 66 companies; they are not statistically valid estimates for the study population of 800 companies. These statistics are substantially different from the estimates shown in Output 61.2.1. For example, the asset total computed without the sampling weights is only \$235 billion, compared to the estimate of \$5,217 billion computed with the sampling weights. The mean of the assets is \$3.56 billion for the sample, but the weighted estimate for the study population is \$6.52 billion.
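To see where the two sets of asset figures come from, the point estimates can be recomputed directly. The following PROC SQL step is a sketch for cross-checking only: the column aliases are illustrative names, and the step reproduces only the means and sums, not the standard errors that PROC SURVEYMEANS computes.

```sas
/* Sketch: compare unweighted and weighted point estimates for Asset. */
/* Column aliases are hypothetical names chosen for this illustration. */
proc sql;
   select sum(Asset)                      as UnweightedSum,
          avg(Asset)                      as UnweightedMean,
          sum(Weight*Asset)               as WeightedSum,
          sum(Weight*Asset) / sum(Weight) as WeightedMean
   from Company;
quit;
```

The unweighted columns should correspond to the Asset row of Output 61.2.2, and the weighted columns to the Asset row of Output 61.2.1. Because Asset is recorded in millions of dollars, a sum of 234812 corresponds to roughly \$235 billion.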
