The DATASOURCE Procedure

Example 10.2: BLS Consumer Price Index Surveys

This example compares changes in the prices of medical care services across regions for all urban consumers (SURVEY='CU') since May 1980. The data come from the Consumer Price Index Surveys distributed by the U.S. Department of Labor, Bureau of Labor Statistics.

An initial run of PROC DATASOURCE produces descriptive information on the available regions (the OUTBY= data set) and the series variable name corresponding to medical care services (the OUTCONT= data set).

   filename datafile 'host-specific-file-name' <host-options>;
   proc datasource filetype=blscpi interval=month
                   outby=cpikey outcont=cpicont;
      where survey='CU';
   run;
   
   title1 'Partial Listing of the OUTBY= Data Set';
   proc print data=cpikey noobs;
      where upcase(areaname) in 
            ('NORTHEAST','NORTH CENTRAL','SOUTH','WEST');
   run;
   
   title1 'Partial Listing of the OUTCONT= Data Set';
   proc print data=cpicont noobs;
      where index( upcase(label), 'MEDICAL CARE' );
   run;

The OUTBY= data set in Output 10.2.1 lists all cross sections available for the four geographical regions: Northeast (AREA='0100'), North Central (AREA='0200'), Southern (AREA='0300'), and Western (AREA='0400'). The OUTCONT= data set gives the variable names for medical care related series.

Output 10.2.1: Partial Listings of the OUTBY= and OUTCONT= Data Sets

Partial Listing of the OUTBY= Data Set

survey  season  area  basptype  baseper            st_date  end_date  ntime  nobs  nseries  nselect  surtitle          areaname
CU      U       0100  A         DECEMBER 1977=100  DEC1966  JUL1990     284   284        1        1  ALL URBAN CONSUM  NORTHEAST
CU      U       0100  S         1982-84=100        DEC1966  JUL1990     284   284       90       90  ALL URBAN CONSUM  NORTHEAST
CU      U       0100  S         DECEMBER 1982=100  DEC1982  JUL1990      92    92        7        7  ALL URBAN CONSUM  NORTHEAST
CU      U       0100  S         DECEMBER 1986=100  DEC1986  JUL1990      44    44        1        1  ALL URBAN CONSUM  NORTHEAST
CU      U       0200  A         DECEMBER 1977=100  DEC1966  JUL1990     284   284        1        1  ALL URBAN CONSUM  NORTH CENTRAL
CU      U       0200  S         1982-84=100        DEC1966  JUL1990     284   284       90       90  ALL URBAN CONSUM  NORTH CENTRAL
CU      U       0200  S         DECEMBER 1982=100  DEC1982  JUL1990      92    92        7        7  ALL URBAN CONSUM  NORTH CENTRAL
CU      U       0200  S         DECEMBER 1986=100  DEC1986  JUL1990      44    44        1        1  ALL URBAN CONSUM  NORTH CENTRAL
CU      U       0300  A         DECEMBER 1977=100  DEC1966  JUL1990     284   284        1        1  ALL URBAN CONSUM  SOUTH
CU      U       0300  S         1982-84=100        DEC1966  JUL1990     284   284       90       90  ALL URBAN CONSUM  SOUTH
CU      U       0300  S         DECEMBER 1982=100  DEC1982  JUL1990      92    92        7        7  ALL URBAN CONSUM  SOUTH
CU      U       0300  S         DECEMBER 1986=100  DEC1986  JUL1990      44    44        1        1  ALL URBAN CONSUM  SOUTH
CU      U       0400  A         DECEMBER 1977=100  DEC1966  JUL1990     284   284        1        1  ALL URBAN CONSUM  WEST
CU      U       0400  S         1982-84=100        DEC1966  JUL1990     284   284       90       90  ALL URBAN CONSUM  WEST
CU      U       0400  S         DECEMBER 1982=100  DEC1982  JUL1990      92    92        7        7  ALL URBAN CONSUM  WEST
CU      U       0400  S         DECEMBER 1986=100  DEC1986  JUL1990      44    44        1        1  ALL URBAN CONSUM  WEST


Partial Listing of the OUTCONT= Data Set

name  selected  type  length  varnum  label                        format  formatl  formatd
ASL5         1     1       5       .  SERVICES LESS MEDICAL CARE                   0        0
A0L5         1     1       5       .  ALL ITEMS LESS MEDICAL CARE                  0        0
A5           1     1       5       .  MEDICAL CARE                                 0        0
A51          1     1       5       .  MEDICAL CARE COMMODITIES                     0        0
A512         1     1       5       .  MEDICAL CARE SERVICES                        0        0
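
The series name can also be looked up programmatically rather than read off the listing. The following is a minimal sketch, not part of the original example. It assumes the CPICONT data set created by the initial run, that exactly one label matches, and a SAS release that supports the TRIMMED keyword in PROC SQL. The macro variable name MEDVAR is hypothetical.

   /* Sketch: store the series name for medical care services in   */
   /* a macro variable. MEDVAR is a hypothetical name chosen here. */
   proc sql noprint;
      select name into :medvar trimmed
         from cpicont
         where upcase(label) = 'MEDICAL CARE SERVICES';
   quit;

   /* Write the resolved name to the log for inspection */
   %put NOTE: Series name for medical care services is &medvar;

If the lookup succeeds, &medvar could stand in for the literal series name in a later KEEP statement.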


The following statements make use of this information to extract the data for A512 and descriptive information on cross sections containing A512:

   proc format;
      value $areafmt '0100' = 'Northeast Region'
                     '0200' = 'North Central Region'
                     '0300' = 'Southern Region'
                     '0400' = 'Western Region';
   run;
   
   filename datafile 'host-specific-file-name' <host-options>;
   proc datasource filetype=blscpi interval=month
                   out=medical outall=medinfo;
      where survey='CU' and area in ( '0100','0200','0300','0400' );
      keep a512;
      range  from 1980:5;
      format area $areafmt.;
      rename a512=medcare;
   run;
   
   title1 'Information on Medical Care Service';
   proc print data=medinfo;
   run;

Output 10.2.2: Printout of the OUTALL= Data Set

Information on Medical Care Service

Obs  survey  season  area                  basptype  baseper      length  byselect  name    kept  selected  type  varnum  blknum
  1  CU      U       Northeast Region      S         1982-84=100       5         1  MEDCAR     1         1     1       7    3479
  2  CU      U       North Central Region  S         1982-84=100       5         1  MEDCAR     1         1     1       7    3578
  3  CU      U       Southern Region       S         1982-84=100       5         1  MEDCAR     1         1     1       7    3677
  4  CU      U       Western Region        S         1982-84=100       5         1  MEDCAR     1         1     1       7    3776

Obs  label                  format  formatl  formatd  st_date  end_date  ntime  nobs  ninrange
  1  MEDICAL CARE SERVICES                0        0  DEC1977  JUL1990     152   152       123
  2  MEDICAL CARE SERVICES                0        0  DEC1977  JUL1990     152   152       123
  3  MEDICAL CARE SERVICES                0        0  DEC1977  JUL1990     152   152       123
  4  MEDICAL CARE SERVICES                0        0  DEC1977  JUL1990     152   152       123

Obs  surtitle          areaname       s_code         units  ndec
  1  ALL URBAN CONSUM  NORTHEAST      CUUR0100SA512            1
  2  ALL URBAN CONSUM  NORTH CENTRAL  CUUR0200SA512            1
  3  ALL URBAN CONSUM  SOUTH          CUUR0300SA512            1
  4  ALL URBAN CONSUM  WEST           CUUR0400SA512            1


Note that only the cross sections with BASEPER='1982-84=100' are listed in the OUTALL= data set (see Output 10.2.2). This is because only those cross sections contain data for MEDCARE.

The OUTALL= data set indicates that data values are stored with one decimal place (see the NDEC variable). Therefore, they need to be rescaled, as follows:

   data medical;
      set medical;
      medcare = medcare * 0.1;
   run;
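
A quick sanity check on the rescaling, offered as a sketch rather than as part of the original example: because the series is indexed to 1982-84=100, the rescaled values should average roughly 100 over 1982 through 1984 in each region. An average near 1000 would suggest the rescaling step was skipped, and one near 10 would suggest it was applied twice. The sketch assumes that the OUT= data set carries DATE as a SAS date value and AREA as a BY variable.

   /* Sketch: average the rescaled index over the base period, by region. */
   /* Assumes DATE holds SAS date values and AREA identifies the region.  */
   title1 'Base-Period Check of Rescaled MEDCARE';
   proc means data=medical mean min max maxdec=1;
      where date between '01jan1982'd and '31dec1984'd;
      class area;
      var medcare;
   run;
   title1;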

The variation of MEDCARE against DATE with respect to different geographic regions can be demonstrated graphically, as follows:
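
One way to produce such a plot is sketched below. It uses PROC SGPLOT, which is available only in later SAS releases and is an assumption here, not necessarily the procedure used to create the output shown. The title and axis labels are illustrative, and the data set MEDICAL is assumed to contain the rescaled MEDCARE values along with the DATE and AREA variables.

   /* Sketch: one line per region, MEDCARE plotted against DATE. */
   /* PROC SGPLOT and the labels below are assumptions.          */
   title1 'Medical Care Services Index by Region';
   proc sgplot data=medical;
      series x=date y=medcare / group=area;
      xaxis label='Date';
      yaxis label='Medical care services index (1982-84=100)';
   run;
   title1;

Because AREA carries the $AREAFMT. format assigned in the DATASOURCE step, the legend should show the region names rather than the raw area codes.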

Output 10.2.3: Plot of Time Series in the OUT= Data Set for FILETYPE=BLSCPI

This example illustrates the following features:


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.