The EXPAND Procedure

It is important to distinguish between variables that are measured at
points in time and variables that represent totals or averages over an interval.
Point-in-time values are often called *stocks* or *levels*.
Variables that represent totals or averages over an interval are
often called *flows* or *rates*.
For example, the annual series "U.S. Gross Domestic Product" represents the
total value of production over the year and also the yearly average rate of
production in dollars per year.
However, a monthly variable *inventory* may represent the cost
of a stock of goods as of the end of the month.
When the data represent periodic totals or averages,
the process of interpolation to a higher frequency
is sometimes called *distribution*,
and the total values of the larger intervals are said to be
*distributed* to the smaller intervals.
The process of interpolating periodic total or average values
to lower frequency estimates is sometimes called *aggregation*.
By default, PROC EXPAND assumes that all time series represent
beginning-of-period point-in-time values.
If a series does not measure beginning-of-period point-in-time values,
interpolation of the data values using this assumption is not appropriate,
and you should specify the correct observation characteristics of the series.
The observation characteristics of series are specified with the OBSERVED=
option on the CONVERT statement.

For example, suppose that the data set ANNUAL contains variables A, B, and C that measure yearly totals, while the variables X, Y, and Z measure first-of-year values. The following statements estimate the contribution of each month to the annual totals in A, B, and C, and interpolate first-of-month estimates of X, Y, and Z.

   proc expand data=annual out=monthly from=year to=month;
      id date;
      convert x y z;
      convert a b c / observed=total;
   run;
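The reverse direction, aggregation, can be sketched in the same style. The following is a minimal illustration, not an example from this documentation: the data set MONTHLY, the output data set YEARLY, and the variable SALES are hypothetical names, and SALES is assumed to hold monthly totals.

```sas
/* Hypothetical input: MONTHLY contains the ID variable DATE and  */
/* the variable SALES, assumed here to hold monthly totals.       */
proc expand data=monthly out=yearly from=month to=year;
   id date;
   /* OBSERVED=TOTAL tells the procedure that SALES is a period   */
   /* total, so the yearly output values are also period totals.  */
   convert sales / observed=total;
run;
```

Without OBSERVED=TOTAL, SALES would be treated under the default assumption as a beginning-of-period point-in-time value, and the yearly output would not be annual totals.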

The EXPAND procedure supports five different observation characteristics. The OBSERVED= option values for these five observation characteristics are:

- BEGINNING: beginning-of-period values
- MIDDLE: period midpoint values
- END: end-of-period values
- TOTAL: period totals
- AVERAGE: period averages
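As a brief sketch of how these values might be combined in one step, the following statements use a separate CONVERT statement for each observation characteristic. The data set QUARTERLY and the variables INVENTORY and RATE are hypothetical names chosen for illustration. INVENTORY is assumed to be an end-of-quarter stock, and RATE is assumed to be a quarterly average.

```sas
/* Hypothetical input: QUARTERLY contains the ID variable DATE,   */
/* INVENTORY (assumed end-of-quarter stock), and RATE (assumed    */
/* quarterly average).                                            */
proc expand data=quarterly out=monthly from=qtr to=month;
   id date;
   convert inventory / observed=end;      /* end-of-period values */
   convert rate      / observed=average;  /* period averages      */
run;
```

Grouping variables by observation characteristic, one CONVERT statement per group, mirrors the pattern of the earlier example with A, B, C and X, Y, Z.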

The interpolation of each series is adjusted appropriately for its observation characteristics. When OBSERVED=TOTAL or AVERAGE is specified, the interpolating curve is fit to the data values so that the area under the curve within each input interval equals the value of the series. For OBSERVED=MIDDLE or END, the curve is fit through the data points, with the time position of each data value placed at the specified offset from the start of the interval.
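The conditions described above can be written out as a rough sketch. The notation is introduced here for illustration and is not taken from this documentation: f is the interpolating curve, input interval i spans [t_i, t_{i+1}], and y_i is the observed value for that interval. The END case treats t_{i+1} as the end of the interval, which is a simplifying assumption. The division by interval length in the AVERAGE case is also an assumption. It coincides with the area condition stated above when time is measured in units of the input interval.

```latex
\begin{align*}
\text{BEGINNING:} \quad & f(t_i) = y_i \\
\text{MIDDLE:}    \quad & f\!\left(\tfrac{t_i + t_{i+1}}{2}\right) = y_i \\
\text{END:}       \quad & f(t_{i+1}) = y_i \\
\text{TOTAL:}     \quad & \int_{t_i}^{t_{i+1}} f(t)\,dt = y_i \\
\text{AVERAGE:}   \quad & \frac{1}{t_{i+1} - t_i}\int_{t_i}^{t_{i+1}} f(t)\,dt = y_i
\end{align*}
```

The first three conditions pin the curve to a point inside or at the edge of each interval. The last two constrain the curve over the whole interval.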

See the section "The OBSERVED= Option" on this page for details.

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.