The GMAP procedure requires a map data set and a response
data set. These two data sets must contain the required variables or the procedure
stops with an error message. You can use the same data set as both the map
data set and the response data set, as long as the requirements are met. If
a different data set is used as the response data set, it must contain an
ID variable that is identical to the ID variable in the map data set.
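A map data set is a SAS data set that contains coordinates that define the
boundaries of map areas, such as states or counties. A map data set must
contain at least these variables:
-
a numeric variable named X that contains the horizontal coordinates of the
boundary points
-
a numeric variable named Y that contains the vertical coordinates of the
boundary points
-
one or more identification (ID) variables that uniquely identify the map areas.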
The X and Y variable values in the map data set do not
have to be in any specific units because they are rescaled by the GMAP procedure
based on the minimum and maximum values in the data set. The minimum X and
Y values are in the lower-left corner of the map, and the maximum X and Y
values are in the upper-right corner.
Map data sets in which the X and Y variables contain
longitude and latitude should be projected before you use them with PROC GMAP.
See The GPROJECT Procedure
for details.
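As a minimal sketch of that projection step, the following assumes a hypothetical map data set WORK.LATLON whose X and Y values are longitude and latitude in degrees, with longitude increasing to the east, and a hypothetical ID variable named AREA. The DEG and EASTLONG options reflect those assumptions; adjust or drop them to match the units and sign convention of your own data.

```sas
/* Hypothetical unprojected map: one area, corners given as            */
/* longitude (X) and latitude (Y) in degrees, east longitude positive. */
data work.latlon;
   input area x y;
   datalines;
1 -84 34
1 -84 36
1 -76 36
1 -76 34
;
run;

/* Project the coordinates. DEG (short for DEGREES) says the input is  */
/* in degrees rather than radians; EASTLONG says longitude increases   */
/* to the east.                                                        */
proc gproject data=work.latlon out=work.projected deg eastlong;
   id area;
run;
```

The output data set WORK.PROJECTED can then be named in the MAP= option of PROC GMAP.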
Optionally, the map data set can also contain a variable
named SEGMENT to identify map areas that consist of noncontiguous polygons.
Each unique value of the SEGMENT variable within a single map area defines
a distinct polygon. If the SEGMENT variable is not present, each map area
is drawn as a separate closed polygon that indicates a single segment.
The observations for each segment of a map area in the
map data set must occur in the order in which the points are to be joined.
The GMAP procedure forms map area outlines by connecting the boundary points
of each segment in the order in which they appear in the data set, eventually
joining the last point to the first point to complete the polygon.
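The point-ordering and SEGMENT rules can be illustrated with a small hand-built map. This is only a sketch: the data set name WORK.SQUARES, the ID variable AREA, and the coordinates are made up for illustration. Area 1 consists of two noncontiguous squares (segments 1 and 2), and area 2 is a single rectangle.

```sas
/* Each segment lists its corner points in the order they are joined;  */
/* GMAP closes each polygon by joining the last point to the first.    */
data work.squares;
   input area segment x y;
   datalines;
1 1 0 0
1 1 0 1
1 1 1 1
1 1 1 0
1 2 2 0
1 2 2 1
1 2 3 1
1 2 3 0
2 1 0 2
2 1 0 3
2 1 3 3
2 1 3 2
;
run;

/* The same data set serves as both map and response data set. */
proc gmap map=work.squares data=work.squares;
   id area;
   choro area / discrete;
run;
quit;
```

Because X and Y are rescaled by the procedure, the arbitrary units used here do not matter.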
Any variables in the map data set other than the ones
mentioned above are ignored for the purpose of determining map boundaries.
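In addition to the variables described in
About Map Data Sets, the SAS/GRAPH map
data sets may also contain the following variables:
-
a numeric variable named LONG that contains unprojected longitude values
-
a numeric variable named LAT that contains unprojected latitude values.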
The GMAP procedure uses the values of the X and Y variables
to draw the map. Therefore, if you want to produce an unprojected map by
using the values in LONG and LAT, you would have to rename LONG and LAT to
X and Y first.
SAS/GRAPH includes
a number of predefined map data sets. These data sets are described in SAS/GRAPH Map Data Sets.
Most Institute-supplied map data sets
contain four coordinate variables (X, Y, LONG, and LAT). In this case, X
and Y are always projected values, and the SAS/GRAPH procedures use them
by default. To use the unprojected values that are contained in the LONG
and LAT variables, rename LONG and LAT to X and Y, because the GMAP procedure
automatically uses X and Y. See Input Map Data Sets that Contain Both Projected and Unprojected Values
for more details.
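One way to do the renaming is with data set options in a DATA step, as sketched below. The sketch assumes that MAPS.AFRICA is installed, that it carries all four coordinate variables, and that its ID variable is named ID; check your copy with PROC CONTENTS first. The existing X and Y must be dropped so that the names are free for LONG and LAT.

```sas
/* DROP= is applied before RENAME=, so the projected X and Y are       */
/* discarded and LONG and LAT take over their names.                   */
data work.unprojected;
   set maps.africa(drop=x y rename=(long=x lat=y));
run;

/* Draw the unprojected map, using the map data set as its own         */
/* response data set.                                                  */
proc gmap map=work.unprojected data=work.unprojected;
   id id;
   choro id / discrete nolegend;
run;
quit;
```

WORK.UNPROJECTED could equally be passed to PROC GPROJECT to apply a different projection before mapping.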
The Institute-supplied map data sets that
contain X and Y variables (and no LONG and LAT variables) are usually projected
maps. However, a few map data sets for the US and Canada contain
X and Y values that are unprojected longitude and latitude. In this case,
use the GPROJECT procedure to project the map (see The GPROJECT Procedure).
Note: You can determine whether a SAS map data set is projected
or unprojected by looking at the description of each variable that is displayed
when you use the CONTENTS procedure or by browsing the MAPS.METAMAPS data
set.
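For example, the following sketch lists the variable descriptions for MAPS.US and prints the first rows of MAPS.METAMAPS. It assumes that the MAPS libref is already assigned at your site; the OBS=20 limit is only there to keep the listing short.

```sas
/* Variable labels in the CONTENTS output indicate whether the         */
/* coordinates are projected or unprojected.                           */
proc contents data=maps.us;
run;

/* Browse the catalog of supplied map data sets. */
proc print data=maps.metamaps(obs=20);
run;
```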
Several map data sets that are available with SAS/GRAPH
make it easy to label maps:
-
MAPS.USCENTER
-
contains the X and Y coordinates
of the visual center of each state in the U.S. and Washington, D.C., as well
as points in the ocean for states that are too small to contain a label. You
can use MAPS.USCENTER with the MAPS.US, MAPS.USCOUNTY, MAPS.COUNTIES, and
MAPS.COUNTY data sets.
-
MAPS.USCITY
-
contains the X and Y coordinates
of selected cities in the U.S. Many city names occur in more than one state,
so you may have to subset by state to avoid duplication. You can use MAPS.USCITY
with the MAPS.US, MAPS.USCOUNTY, MAPS.COUNTIES, and MAPS.COUNTY data sets.
-
MAPS.CANCENS
-
contains the names of the Canadian
census divisions. You can use MAPS.CANCENS with the MAPS.CANADA and MAPS.CANADA3
data sets.
See the MAPS.METAMAPS data set for details on each of
the Institute-supplied map data sets.
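As a rough sketch of how such a labeling data set might be used, the following builds an Annotate data set of city names from MAPS.USCITY for a single state and overlays it on the matching subset of MAPS.US. The variable names STATE and CITY in MAPS.USCITY are assumed, as is the choice of state (FIPS code 37, North Carolina); the label size and position are arbitrary.

```sas
/* Build Annotate observations: one text label per city.               */
data work.citylabels;
   length function $ 8 text $ 40;
   retain xsys ysys '2'       /* X and Y are map data coordinates      */
          hsys '3'            /* SIZE is a percent of the graphics area */
          when 'a'            /* draw labels after the map is drawn    */
          function 'label'
          position '5'        /* center the text on the point          */
          size 2;
   set maps.uscity(where=(state=37));   /* subset by state */
   text = city;
   keep function text xsys ysys hsys when position size x y;
run;

/* Draw the state outline and overlay the labels. */
proc gmap map=maps.us(where=(state=37))
          data=maps.us(where=(state=37))
          annotate=work.citylabels;
   id state;
   choro state / discrete nolegend;
run;
quit;
```

With many cities in one state the labels will overlap, so in practice a further subset of cities is usually needed.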
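A response data set is a SAS data set that contains
-
identification variables that match those in the map data set
-
one or more response variables that contain data values associated with the
map areas.
The response data set can contain other variables in
addition to these required variables.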
The values of the map area identification variables
in the response data set determine the map areas to be included on the map
unless you use the ALL option in the PROC GMAP statement. That is, unless
you use ALL in the PROC GMAP statement, only the map areas with response values
are shown on the map. As a result, you do not need to subset your map data
set if you are mapping only a small section of the map. However, if you map
the same small section frequently, create a subset of the map data set for
efficiency.
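The effect of the ALL option can be sketched with a small, made-up response data set. WORK.SALES and its values are hypothetical; STATE holds FIPS codes so that it matches the ID variable in MAPS.US.

```sas
/* Hypothetical response data for three states (NC, SC, VA). */
data work.sales;
   input state sales;
   datalines;
37 120
45 80
51 95
;
run;

/* Without ALL: only the three states that have response values        */
/* are included on the map.                                            */
proc gmap map=maps.us data=work.sales;
   id state;
   choro sales;
run;
quit;

/* With ALL: every map area in MAPS.US is drawn. */
proc gmap map=maps.us data=work.sales all;
   id state;
   choro sales;
run;
quit;
```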
For choropleth, block, and prism maps, the response
variables can be either character or numeric. For surface maps, the response
variables must be numeric with only positive values.
The GMAP procedure can produce block, choropleth,
and prism maps for both numeric and character response variables. Numeric
variables fall into two categories: discrete and continuous.
Numeric response variables are always treated as continuous
variables unless the DISCRETE option is used in the action statement.
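The difference can be sketched as follows. The data set WORK.RATINGS and its values are hypothetical, and LEVELS=2 is an arbitrary choice for the continuous case.

```sas
/* Hypothetical response data: a numeric rating per state (FIPS code). */
data work.ratings;
   input state rating;
   datalines;
13 3
37 1
45 2
51 2
;
run;

proc gmap map=maps.us data=work.ratings;
   id state;
   /* Default: RATING is treated as continuous and grouped into ranges. */
   choro rating / levels=2;
run;
   /* DISCRETE: each distinct value of RATING is its own response level. */
   choro rating / discrete;
run;
quit;
```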
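Response levels are
the values that identify categories of data on the graph. The categories
that are shown on the graph are based on the values of the response variable.
Based on the type of the response variable, a response level can represent
these values:
-
a unique character value, for a character response variable
-
a unique numeric value, for a discrete numeric response variable
-
a range of numeric values, for a continuous numeric response variable.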
The BLOCK,
CHORO, and PRISM statements assign patterns to response
levels. In CHORO and PRISM maps, response levels are shown as map areas. However,
in BLOCK maps, response levels are shown as blocks. The default fill pattern
for the response level is solid.
PATTERN statements can define the fill patterns and
colors for both blocks and map areas. PATTERN definitions that define valid
block patterns are applied to the blocks (response levels), and PATTERN definitions
that define valid map patterns are applied to map areas.
See PATTERN Statement
for more information on fill pattern values and default pattern rotation.
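A sketch of how PATTERN definitions might be split between blocks and map areas in a block map follows. The response data set WORK.SALES is hypothetical and the colors are arbitrary. The sketch relies on MEMPTY being a map pattern and SOLID being a block (bar) pattern.

```sas
/* Hypothetical response data for three states (NC, SC, VA). */
data work.sales;
   input state sales;
   datalines;
37 120
45 80
51 95
;
run;

goptions reset=pattern;             /* clear earlier PATTERN definitions */

pattern1 value=mempty color=gray;   /* map pattern: applied to map areas */
pattern2 value=solid  color=blue;   /* block pattern: first level        */
pattern3 value=solid  color=red;    /* block pattern: second level       */

proc gmap map=maps.us data=work.sales;
   id state;
   block sales / levels=2;
run;
quit;
```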
Identification (ID) variables are common to both the map data set and
the response data set. They identify the map areas (for example, counties,
states, or provinces) that make up the map. A unit area or map area is a group of observations with the same ID value. The GMAP
procedure matches the value of the response variables for each map area in
the response data set to the corresponding map area in the map data set to
create the output graphs.
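Because the match depends entirely on ID values, it can be useful to check for response observations whose ID has no counterpart in the map data set. The following is one possible check using PROC SQL; WORK.SALES is a hypothetical response data set, and the value 99 is deliberately not a state code in MAPS.US.

```sas
/* Hypothetical response data; state 99 has no matching map area. */
data work.sales;
   input state sales;
   datalines;
37 120
45 80
99 15
;
run;

/* List response ID values that do not occur in the map data set. */
proc sql;
   select distinct state
      from work.sales
      where state not in (select state from maps.us);
quit;
```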
Whether the GMAP procedure draws a map area and whether it displays patterns
for response values depend on the contents of the response data set and on
the ALL and MISSING options. Displaying Map Areas and Response Data
describes the conditions under which the procedure does or does not display
map areas and response data.
To use the GMAP procedure, you must do the
following:
-
If necessary, issue a LIBNAME statement for the
SAS data library that contains the map data set that you want to display.
-
Determine what processing needs to be done to
the map data set before it is displayed. Use the GPROJECT, GREDUCE, and GREMOVE
procedures or a DATA step to perform the necessary processing.
-
Issue a LIBNAME statement for the SAS data library
that contains the response data set, or use a DATA step to create a response
data set.
-
Use the PROC GMAP statement to identify the map
and response data sets.
-
Use the ID statement to name the identification
variable(s).
-
Use a BLOCK, CHORO, PRISM, or SURFACE statement
to identify the response variable and generate the map.
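Taken together, the steps above might look like the following sketch. It assumes that the MAPS libref is already assigned (otherwise a LIBNAME statement pointing at your site's map library is needed) and that MAPS.US needs no further processing. The response data set WORK.SALES and its values are hypothetical.

```sas
/* Steps 1 and 2: MAPS is assumed to be assigned already, and MAPS.US  */
/* is assumed to need no GPROJECT, GREDUCE, or GREMOVE processing.     */

/* Step 3: create the response data set with a DATA step.              */
/* STATE holds FIPS codes so that it matches the ID in MAPS.US.        */
data work.sales;
   input state sales;
   datalines;
13 60
37 120
45 80
51 95
;
run;

/* Step 4: identify the map and response data sets. */
proc gmap map=maps.us data=work.sales;
   /* Step 5: name the identification variable. */
   id state;
   /* Step 6: identify the response variable and generate the map. */
   choro sales / levels=2;
run;
quit;
```

Replacing the CHORO statement with a BLOCK, PRISM, or SURFACE statement produces the other map types from the same data sets.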
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.