| The FREQ Procedure |
| Missing Values |
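By default, PROC FREQ excludes missing values before it constructs the frequency and crosstabulation tables. PROC FREQ also excludes missing values before computing statistics. However, PROC FREQ displays the total frequency of observations with missing values below each table. Two options in the TABLES statement change how PROC FREQ handles missing values: MISSPRINT displays the missing value frequencies in the table but still excludes them from percentages and statistics, and MISSING treats missing values as a valid level and includes them in percentages and statistics.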
The OUT= option in the TABLES statement includes an observation in the output data set that contains the frequency of missing values. The NMISS keyword in the OUTPUT statement creates a variable in the output data set that contains the number of missing values.
The figure Missing Values in Frequency Tables shows three ways that PROC FREQ handles missing values. The first table uses the default method; the second table uses MISSPRINT; and the third table uses MISSING.
[Figure: Missing Values in Frequency Tables]
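The three treatments of missing values can be compared side by side with a short step such as the following. This is a minimal sketch: the data set ONE and its variable A are hypothetical and exist only for illustration.

```sas
/* Hypothetical data: six observations, one with a missing value of A */
data one;
   input a @@;
   datalines;
1 1 2 . 2 1
;
run;

proc freq data=one;
   tables a;               /* default: missing value excluded, its frequency reported below the table */
   tables a / missprint;   /* missing value listed in the table, excluded from percentages */
   tables a / missing;     /* missing value treated as a valid level of A */
run;
```

Each TABLES statement produces its own table, so the three results appear one after another in the output.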
When a combination of variable values for a crosstabulation is missing, PROC FREQ assigns zero to the frequency count for the table cell. By default, PROC FREQ omits missing combinations in list format and in the output data set that is created with a TABLES statement. To include the missing combinations, use SPARSE with LIST or OUT= in the TABLES statement.
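As an illustration of SPARSE, the following sketch uses a hypothetical data set TWO in which the combination A=2, B=2 never occurs. The data set and variable names are assumptions made for this example.

```sas
/* Hypothetical data: the combination a=2, b=2 does not occur */
data two;
   input a b @@;
   datalines;
1 1  1 2  2 1  2 1
;
run;

proc freq data=two;
   tables a*b / list sparse;               /* list format that includes the zero-count combination */
   tables a*b / sparse out=cells noprint;  /* output data set that includes the zero-count combination */
run;

proc print data=cells;
run;
```

Without SPARSE, the list-format table and the data set CELLS would each omit the A=2, B=2 combination.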
PROC FREQ treats missing BY variable values like any other BY variable value. The missing values form a separate BY group. When the value of a WEIGHT variable is missing, PROC FREQ excludes the observation from the analysis.
| Procedure Output |
By default, a one-way table lists the variable name, variable values, frequency counts, percentages, cumulative frequency counts, cumulative percentages, and the number of missing values. Unless you use LIST in the TABLES statement, a two-way table appears as a crosstabulation table. An n-way table appears as multiple crosstabulation tables with one table for each combination of values for the stratification variables. By default, each cell of a crosstabulation table lists the frequency count, percentage of the total frequency count, row percentage, and column percentage.
Use the following TABLES statement options to report additional information for each table cell:
By default, PROC FREQ displays the next one-way frequency table on the current page when there is enough space to display the entire table. If you use COMPRESS in the PROC FREQ statement, the next one-way table starts to display on the current page even when the entire table will not fit. If you use PAGE in the PROC FREQ statement, each frequency or crosstabulation table always displays on a separate page.
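A brief sketch of the two paging options follows. It assumes that the sample data set SASHELP.CLASS is available in your installation and that output goes to a paginated listing destination.

```sas
/* COMPRESS: start the next one-way table on the current page
   even when the entire table will not fit there */
proc freq data=sashelp.class compress;
   tables sex age;
run;

/* PAGE: display each table on a separate page */
proc freq data=sashelp.class page;
   tables sex age;
run;
```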
By default, PROC FREQ uses the BEST6. format to display a cell frequency when the frequency is less than 1E6. Otherwise, it uses the BEST7. format so that frequency values with more than seven significant digits display in scientific notation (E format). The V5FMT option in the TABLES statement uses BEST8. format so that frequency values with more than eight significant digits display in scientific notation.
When scientific notation is used, only the first few significant digits are shown. If you need more significant digits than PROC FREQ displays, create an output data set by specifying OUT= in the TABLES statement. Then use PROC PRINT and assign an appropriate format to the variable COUNT. For example, the statement
    format count 10.;
displays exact integer counts up to 9999999999. For more information about formats, see the section on components of the SAS Language in SAS Language Reference: Concepts.
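The complete sequence might look like the following sketch. The data set BIG, its variables REGION and N, and the output data set COUNTS are hypothetical; the WEIGHT statement is used here only to produce large frequencies from a small example.

```sas
/* Hypothetical data: N holds a large frequency for each region */
data big;
   input region $ n;
   datalines;
East 123456789
West 987654321
;
run;

/* Store the frequencies in a data set instead of displaying them */
proc freq data=big noprint;
   tables region / out=counts;
   weight n;
run;

/* Display COUNT as an exact integer rather than in scientific notation */
proc print data=counts;
   format count 10.;
run;
```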
The NOPRINT option in the PROC FREQ statement and NOPRINT, NOCOL, NOCUM, NOFREQ, NOPERCENT, and NOROW in the TABLES statement suppress displayed output. Use NOPRINT in the PROC FREQ statement to suppress all displayed output, including output that would otherwise go to the Output Delivery System (ODS). Use NOPRINT in the TABLES statement to suppress frequency and crosstabulation tables but still display the requested statistics. Use NOCOL, NOCUM, NOFREQ, NOPERCENT, and NOROW to suppress various frequencies and percentages in the frequency and crosstabulation tables.
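The following sketch shows the suppression options at the TABLES statement level. It assumes that the sample data set SASHELP.CLASS is available; any data set with two classification variables would serve.

```sas
proc freq data=sashelp.class;
   tables sex / nocum;                       /* one-way table without cumulative columns */
   tables sex*age / nocol norow nopercent;   /* cells show frequency counts only */
   tables sex*age / noprint chisq;           /* no crosstabulation table; chi-square statistics still display */
run;
```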
| Output Data Sets |
PROC FREQ produces two types of output data sets that you can use with other statistical and reporting procedures. Specifying OUT= in the TABLES statement creates a data set that contains frequency counts and percentages. Specifying an OUTPUT statement creates a data set that contains statistics.
PROC FREQ does not display the output data set. Use PROC PRINT, PROC REPORT, or any other SAS reporting tool to display the output data set.
The OUT= option in the TABLES statement creates an output data set that contains one observation for each combination of the variable values in the last table request. By default, each observation contains the frequency and percentage for each combination of variable values. When the input data set contains missing values, the output data set contains an observation with the frequency of missing values. The output data set includes any BY variables, the table request variables, and the variables COUNT and PERCENT.
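If you use OUTEXPECT and OUTPCT, the output data set also contains expected frequencies and row, column, and table percentages, respectively. The additional variables are EXPECTED (the expected cell frequency, from OUTEXPECT) and PCT_ROW, PCT_COL, and PCT_TABL (the row, column, and table percentages, from OUTPCT; PCT_TABL applies to table requests with more than two variables).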
When you submit the following statements
    proc freq;
       tables a a*b / out=d;
    run;
the output data set D contains frequencies and percentages for the last table request, A*B. If A has two levels (1 and 2), B has three levels (1, 2, and 3), and no table cell count is zero or missing, the output data set D includes six observations, one for each combination of A and B. The first observation corresponds to A=1 and B=1; the second observation corresponds to A=1 and B=2; and so on. The data set also includes the variables COUNT and PERCENT. The value of COUNT is the number of observations that have the given combination of A and B values. The value of PERCENT is the percent of the total number of observations having that A and B combination.
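To see the effect of OUTEXPECT and OUTPCT on the output data set, a step along the following lines can be used. This is a sketch that assumes the sample data set SASHELP.CLASS is available; the output data set name CELLSTATS is hypothetical.

```sas
/* Add expected frequencies and row and column percentages
   to the COUNT and PERCENT variables in the output data set */
proc freq data=sashelp.class noprint;
   tables sex*age / out=cellstats outexpect outpct;
run;

proc print data=cellstats;
run;
```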
When PROC FREQ combines different variable values into the same formatted level, the output data set contains the smallest internal value for the formatted level. For example, suppose a variable X has the values 1.1, 1.4, 1.7, 2.1, and 2.3. When you submit the statement
    format x 1.;
in a PROC FREQ step, the formatted levels listed in the frequency table for X are 1 and 2. If you create an output data set with the frequency counts, the internal values of X are 1.1 and 1.7. To report the internal values of X when you display the output data set, use a format of 3.1 with X.
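The scenario just described can be reproduced with the following sketch. The data set names XVALS and XCOUNTS are hypothetical; the values of X are the ones given in the text.

```sas
data xvals;
   input x @@;
   datalines;
1.1 1.4 1.7 2.1 2.3
;
run;

/* The 1. format groups the five values into the formatted levels 1 and 2 */
proc freq data=xvals;
   tables x / out=xcounts;
   format x 1.;
run;

/* The 3.1 format reveals the internal value stored for each level */
proc print data=xcounts;
   format x 3.1;
run;
```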
The OUTPUT statement creates a SAS data set that contains the statistics that PROC FREQ computes for the last table request. You specify which statistics to store in the output data set. There is an observation with the specified statistics for each stratum or two-way table. If PROC FREQ computes summary statistics for a stratified table, the output data set also contains a summary observation for these statistics. Additionally, you can output statistics for one-way tables, such as chi-square or binomial proportion statistics. If you use a BY statement, the output data set contains observations for each BY group.
The output data set can include the following variables:
The output data set also includes variables with the p-value and degrees of freedom, asymptotic standard error (ASE), or confidence bounds when PROC FREQ computes these values for a specified statistic.
The variable names for the specified statistics in the output data set are the names of the keywords that are enclosed in underscores (for example, _PCHI_ for the PCHI keyword). PROC FREQ forms variable names for the corresponding p-values, degrees of freedom, or confidence bounds by combining the name of the keyword with one of the following prefixes:
| Prefix | Meaning |
| --- | --- |
| DF_ | degrees of freedom |
| E_ | asymptotic standard error (ASE) |
| E0_ | asymptotic standard error under the null hypothesis |
| L_ | lower confidence bound |
| P_ | p-value |
| P2_ | two-sided p-value |
| PL_ | left-sided p-value |
| PR_ | right-sided p-value |
| U_ | upper confidence bound |
| XP_ | exact p-value |
| XP2_ | exact two-sided p-value |
| XPR_ | exact right-sided p-value |
| XPL_ | exact left-sided p-value |
| XL_ | exact lower confidence bound |
| XU_ | exact upper confidence bound |
| Z_ | standardized value |
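The naming scheme can be inspected with a small OUTPUT statement example. This sketch assumes the sample data set SASHELP.CLASS is available; the output data set name STATS is hypothetical. The CHISQ option is specified in the TABLES statement so that PROC FREQ computes the statistic that the OUTPUT statement requests.

```sas
proc freq data=sashelp.class;
   tables sex*age / chisq;
   output out=stats n nmiss pchi;   /* N, NMISS, and the Pearson chi-square */
run;

/* The variables for PCHI appear as _PCHI_, DF_PCHI, and P_PCHI */
proc print data=stats;
run;
```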
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.