|Using Spatial Data with SAS/GIS Software|
A SAS/GIS spatial database consists of a set of SAS data sets that store the spatial data and a set of SAS catalog entries that define the functions of, and the relationships between, the spatial data elements.
|Spatial Data Sets|
As a component of the SAS System, SAS/GIS software stores all its spatial data in SAS data sets. The data sets for a SAS/GIS spatial database work together as one logical file, even though they are split into multiple physical files.
The spatial data sets implement a network data structure
with links that connect chains to their two end nodes and each node to one
or more chains. This structure is implemented by using direct pointers between
the nodes and chains data sets. The details data set provides curvature points
between nodes of chains, while the polygonal index data set provides an efficient
method of determining the correct sequence of chains to represent polygons.
The following spatial data variables appear in the chains, nodes, and details data sets:
|ROW||row number (used as a link when the spatial data set is used as a keyed data set as well as for database protection)|
|DATE||SAS datetime value when the record was last modified|
|VERSION||data version number|
|ATOM||edit operation number|
|HISTORY||undo history record pointer|
The following linkages exist between and within the spatial data sets:
|Data Set||Variable||Links to...|
|chains||ROW (table note 1)||self|
|chains||FRNODE||starting from-node record in the nodes data set|
|chains||TONODE||starting to-node record in the nodes data set|
|chains||D_ROW||starting detail record in the details data set|
|nodes||C_ROW1-C_ROW5||chain records in the chains data set|
|nodes||NC (table note 2)||node record in the nodes data set used to store additional chain records|
|details||C_ROW||parent chain record in the chains data set or next detail continuation record in the details data set|
|index (table note 3)||C_ROW||starting chain record in the chains data set|
Table note 1: The ROW variable is used as a link between records in the spatial data sets. The ROW variable value for the first record of a feature in the chains or nodes data sets is considered the feature ID. Because some records in the nodes data set are continuations of other records, not every row number in the nodes data set is a feature ID. As a result, node feature ID numbers are not necessarily sequential.
The ROW variable also provides protection against corruption of the database that is caused by the accidental insertion or deletion of records. If records were linked by physical record number rather than by ROW value, an improper record insertion or deletion would throw off all linkages to subsequent records in the database. In the event the database is corrupted, the ROW variable can be used to move the records back into their proper locations with minimal data loss.
Table note 2: A negative value indicates that the variable points to a continuation record. The absolute value of the variable is the row number of the next record used to store that feature's data. In newly imported spatial data, continuation records always point to the next record in the data set, but this is not required. New chains can be attached to existing nodes without having to insert records, which would require extensive pointer reassignments.
Table note 3: The index data set has no ROW variable because it can be easily rebuilt from the chains, nodes, and details data sets from which it was originally constructed.
Because the data sets are linked together by row number, the chains, nodes, and details data sets must be radix-addressable and may not be compressed.
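The protective role of the ROW variable can be illustrated with a small Python sketch. It is not SAS/GIS code: records are modeled as plain dictionaries with a 1-based "ROW" key, and the function names `misplaced` and `restore_by_row` are hypothetical. The sketch assumes that a damaged data set still carries correct ROW values, so each surviving record can be put back at the position its ROW value names.

```python
def misplaced(records):
    """Return (physical_position, ROW) pairs where the two disagree.

    Positions are 1-based, matching the row numbering of a data set.
    """
    return [(pos, rec["ROW"])
            for pos, rec in enumerate(records, start=1)
            if rec["ROW"] != pos]


def restore_by_row(records):
    """Place every record at the position given by its ROW value.

    Slots whose record was lost stay None, so the records after the gap
    keep their original row numbers and the links that point to them
    remain valid.
    """
    if not records:
        return []
    size = max(rec["ROW"] for rec in records)
    slots = [None] * size
    for rec in records:
        slots[rec["ROW"] - 1] = rec  # ROW is 1-based
    return slots
```

In this model, a data set that lost its second record reports every later record as misplaced, and `restore_by_row` leaves a single empty slot rather than shifting all subsequent links.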
The chains data set contains coordinates for the polylines that are used to form line and polygon features. (A polyline consists of either a single line segment or a series of connected line segments.) The chains data set also contains the information that is necessary to implement nodes in the database.
The following system variables are unique to the chains data set:
|FRNODE (table note 1)||starting from-node record.|
|TONODE||starting to-node record.|
|D_ROW||starting detail point record.|
|ND||number of detail points in the chain.|
|RANK||sorting key used to sort all the chains around an arbitrary node by their angle, starting from 0 and proceeding counterclockwise.|
RANK values have the form ffffff.tttttt, where the ffffff component is used to sort the chain around its from-node and the tttttt component is used to sort the chain around its to-node. The ffffff and tttttt components are calculated using a formula of the following form, where Q is the quadrant number (1 through 4) of the chain's direction at the node and A is the angle measured from the start of that quadrant:
rank = 1E5 * ((Q - 1) + tan(A/2))
The tangent term is called the half-angle tangent. Since the angle A/2 can never exceed π/4 (45 degrees), the half-angle tangent has values from 0 to 1. The (Q-1) term extends the range of values to 0 to 4. The values 0, 1, 2, 3, and just under 4 represent angles of 0, 90, 180, 270, and just under 360 degrees, respectively.
The 1E5 multiplier is used to transform decimal rank values to integers. Thus the rank values for a chain have six significant digits.
|XMIN||minimum X coordinate of chain.|
|XMAX||maximum X coordinate of chain.|
|YMIN||minimum Y coordinate of chain.|
|YMAX||maximum Y coordinate of chain.|
Table note 1: The TONODE and FRNODE variables can point to the same record.
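The RANK description can be made concrete with a short Python sketch. It assumes that each rank component has the form 1E5 * ((Q - 1) + tan(A/2)), which is inferred from the description of the half-angle tangent, the (Q-1) term, and the 1E5 multiplier rather than taken from a stated formula. The function name `rank_component`, the use of a direction vector (dx, dy) pointing away from the node, and the rounding to the nearest integer are all assumptions made for illustration.

```python
import math


def rank_component(dx, dy):
    """Sort key for a chain leaving a node in direction (dx, dy).

    Assumed form: 1E5 * ((Q - 1) + tan(A / 2)), where Q is the quadrant
    (1 to 4) and A is the angle measured from the start of that quadrant.
    """
    if dx == 0 and dy == 0:
        raise ValueError("direction vector must be non-zero")
    # Angle counterclockwise from the positive X axis, in [0, 2*pi].
    angle = math.atan2(dy, dx) % (2 * math.pi)
    quarter = math.pi / 2
    # Clamp to 4 in case rounding pushes a tiny negative angle up to 2*pi.
    quadrant = min(int(angle // quarter) + 1, 4)
    within = angle - (quadrant - 1) * quarter  # A, between 0 and 90 degrees
    # Rounding (rather than truncating) is an assumption of this sketch.
    return round(1e5 * ((quadrant - 1) + math.tan(within / 2)))
```

Because the half-angle tangent rises from 0 to 1 across each quadrant, the key increases steadily with the angle, so sorting chains by it orders them counterclockwise around the node without needing the angle itself.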
The XMIN, YMIN, XMAX, and YMAX variables define a bounding box for the chain. These variables are included in the chains data set to make it possible to select all the chains in a given X-Y region by looking only at the chains data set.
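The bounding-box selection described above can be sketched in Python as follows. Chains are modeled as dictionaries keyed by the variable names from the chains data set; the function name `chains_in_region` is hypothetical, and the choice to select chains whose bounding box overlaps the region (rather than chains that lie entirely inside it) is an assumption.

```python
def chains_in_region(chains, xmin, ymin, xmax, ymax):
    """Return the ROW values of chains whose bounding box overlaps the region.

    Only the XMIN, XMAX, YMIN, and YMAX variables are consulted, so the
    nodes and details data sets never need to be read.
    """
    return [c["ROW"] for c in chains
            if c["XMAX"] >= xmin and c["XMIN"] <= xmax
            and c["YMAX"] >= ymin and c["YMIN"] <= ymax]
```

A bounding-box test can select a chain whose actual polyline misses the region, so it works best as a cheap first filter before any exact geometry check.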
In addition to the system variables, the chains data
set may also contain any number of attribute variables, some of which may
be polygon IDs. Since the chains have sides, there are typically paired variables
for bilateral data such as polygon areas or address values. The names of the
paired variables typically end with L and
R for the left and right sides, respectively. For example,
the data set may contain COUNTYL and COUNTYR variables with the codes for
the county areas on the left and right sides of the chain, respectively. However,
this naming convention is not required.
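As an illustration of how paired left and right variables can be used, the following Python sketch picks out the chains that form the boundary of one county. It uses the COUNTYL and COUNTYR names from the example above as defaults; the function name `boundary_chains` and the county codes in the usage note are hypothetical.

```python
def boundary_chains(chains, code, left="COUNTYL", right="COUNTYR"):
    """Return ROW values of chains with `code` on exactly one side.

    A chain with the same code on both sides lies inside the area, so it
    is not part of the area's boundary.
    """
    return [c["ROW"] for c in chains
            if (c[left] == code) != (c[right] == code)]
```

For example, a chain with COUNTYL="183" and COUNTYR="063" is on the boundary of county 183, while a chain with "183" on both sides is interior to it.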
The nodes data set contains the coordinates of the nodes for the chains in the chains data set and the linkage information that is necessary to attach chains to the correct nodes. A node definition may span multiple records in the nodes data set, so only the starting record number for a node is a node feature ID.
The following system variables are unique to the nodes data set:
|C_ROW1-C_ROW5||Chain records for the first five chains connected to the node. If fewer than five chains are connected to the node, the unused variables are set to 0.|
|NC||number of chain pointers (if five or fewer chains are connected to the node) or the negative of the next continuation node record number (if more than five chains are connected to the node). See Variable Linkages in the Spatial Data for more information about how NC is used to string continuation node records.|
|X||X coordinate of node.|
|Y||Y coordinate of node.|
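The way NC strings continuation node records together can be sketched in Python. The nodes data set is modeled as a dictionary from ROW value to record, and the function name `chains_at_node` is hypothetical. The sketch assumes that in the last record of a node, NC holds the number of chain pointers used in that record, which the description above implies but does not state for continuation records.

```python
def chains_at_node(nodes, node_row):
    """Collect the chain rows attached to the node that starts at node_row.

    `nodes` maps a ROW value to a record with C_ROW1-C_ROW5 and NC.
    A negative NC means all five pointers are in use and -NC is the row
    of the next continuation record.
    """
    chain_rows = []
    visited = set()
    row = node_row
    while True:
        if row in visited:
            raise ValueError(f"continuation loop at node record {row}")
        visited.add(row)
        rec = nodes[row]
        pointers = [rec[f"C_ROW{i}"] for i in range(1, 6)]
        if rec["NC"] < 0:
            chain_rows.extend(p for p in pointers if p != 0)
            row = -rec["NC"]  # follow the continuation pointer
        else:
            chain_rows.extend(pointers[:rec["NC"]])
            return chain_rows
```

Because the continuation pointer is an explicit row number, the continuation record can sit anywhere in the data set, which is what allows a new chain to be attached to an existing node without inserting records.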
The details data set stores curvature points of a chain between the two end nodes. Therefore, it contains all the coordinates between the intersection points of the chains. The node coordinates are not duplicated in the details data set.
The following system variables are unique to the details data set:
|C_ROW||parent chain record (if the chain has ten or fewer detail points) or the negative of the next continuation detail record (if the chain has more than ten detail points). See Variable Linkages in the Spatial Data for a description of how C_ROW is used to string continuation detail records.|
|X1-X10||X coordinates of up to 10 detail points.|
|Y1-Y10||Y coordinates of up to 10 detail points.|
The coordinate pairs (X2, Y2) through (X10, Y10)
contain missing values if they are not used. The missing values ensure that
the unused coordinate pairs are never used in any coordinate range calculation.
The various importing methods set unused detail coordinates to missing as
a precautionary measure.
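Putting the chains, nodes, and details data sets together, the following Python sketch rebuilds the full polyline for one chain. Data sets are modeled as dictionaries from ROW value to record, missing values are represented by None, and the function name `chain_points` is hypothetical. It assumes that the last detail record of a chain carries the positive parent chain row in C_ROW, and that unused coordinate pairs always come after the used ones in a record.

```python
def chain_points(chain, nodes, details):
    """Return the (x, y) points of a chain from its from-node to its to-node."""
    start = nodes[chain["FRNODE"]]
    end = nodes[chain["TONODE"]]
    points = [(start["X"], start["Y"])]
    if chain["ND"] > 0:
        row = chain["D_ROW"]
        visited = set()
        while True:
            if row in visited:
                raise ValueError(f"continuation loop at detail record {row}")
            visited.add(row)
            rec = details[row]
            for i in range(1, 11):
                x, y = rec[f"X{i}"], rec[f"Y{i}"]
                if x is None or y is None:
                    break  # the remaining pairs are unused (missing)
                points.append((x, y))
            if rec["C_ROW"] >= 0:
                break  # positive value: parent chain row, no continuation
            row = -rec["C_ROW"]  # negative value: next continuation record
    points.append((end["X"], end["Y"]))
    return points
```

The node coordinates are taken from the nodes data set at both ends, which reflects the point above that node coordinates are not duplicated in the details data set.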
Polygonal indexes are indexes to chains data sets. The index contains a record for each boundary of each polygon that was successfully closed in the index creation process. The same rules that are used to construct polygons are also used to construct polygonal indexes.
The following system variables are unique to polygonal index data sets:
|C_ROW||starting chain from which a polygon can be dynamically traversed and closed. This chain is sometimes referred to as the seed chain for the polygon. Any chain on a polygon's boundary can be the seed chain.|
|FLAGS||control flag for polygons.|
|NC||number of chains in the polygon boundary.|
Polygonal index data
sets are created with the POLYGONAL
INDEX statement in the GIS procedure. See POLYGONAL INDEX Statement for more information about using the GIS
procedure to create polygonal index data sets.
SAS/GIS software uses SAS catalog entries to store metadata for
the spatial database--that is, information about the spatial data values
in the spatial data sets. SAS/GIS spatial databases use the following entry
types: spatial entries, coverage entries, layer entries, and map entries.
A spatial entry is a SAS catalog entry of type GISSPA that identifies the spatial data sets for a given spatial database and defines relationships between the variables in those data sets.
SAS/GIS software supports simple spatial entries and merged spatial entries as follows:
For example, you may have two spatial databases that contain the county boundaries of adjoining states. You can build a merged spatial entry that references both states and then you can view a single map containing both states' counties. Otherwise, you would have to import a new map that contains the two states' counties. This new map would double your spatial data storage requirements.
Spatial entries are created and modified by using the SPATIAL statement in the GIS procedure.
Note: You can also create a new spatial entry by making the following selections from the GIS Map Window's menu bar:
The following other statements in the GIS procedure also update the information in the spatial entry:
You can view a formatted report of the contents of a spatial entry by submitting a SPATIAL CONTENTS statement in the GIS procedure.
See SPATIAL Statement
for more information about using the GIS procedure to create, modify, or view
the contents of spatial entries.
A coverage entry is a SAS catalog entry of type GISCOVER that defines the subset, or coverage, of the spatial data that is available to a map. SAS/GIS maps refer to coverages rather than directly to the spatial data.
A coverage entry contains the following elements:
The WHERE clause binds the coverage entry to the spatial data sets that it subsets. The WHERE clause is checked for compatibility with the spatial data when the coverage entry is created and also whenever a map that uses the coverage entry is opened.
The maximum and minimum X and Y coordinates of the coverage (its bounding extents) are evaluated when the coverage is created. The GIS procedure's COVERAGE CREATE statement reads the matching chains and determines the extents from the chains' XMIN, YMIN, XMAX, and YMAX variables. If you make changes to the chains, nodes, and details data sets that affect the coverage extents, you should use the COVERAGE UPDATE statement to update the bounding extent values.
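The extent calculation can be illustrated with a Python sketch. This is not the GIS procedure's implementation: the function name `coverage_extents` is hypothetical, chains are modeled as dictionaries, and the coverage's WHERE clause is represented by an ordinary predicate function.

```python
def coverage_extents(chains, where):
    """Return (xmin, ymin, xmax, ymax) over the chains that satisfy `where`.

    Returns None when no chain matches, since an empty coverage has no
    meaningful extents.
    """
    matching = [c for c in chains if where(c)]
    if not matching:
        return None
    return (min(c["XMIN"] for c in matching),
            min(c["YMIN"] for c in matching),
            max(c["XMAX"] for c in matching),
            max(c["YMAX"] for c in matching))
```

Since the result depends on the chain bounding boxes at the time of the call, any edit that moves or adds chains can make stored extents stale, which is why the extents need to be refreshed after such changes.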
Multiple coverage entries can refer to the same spatial entry to create different subsets of the spatial data for different maps. For example, you could define a series of coverages to subset a county into multiple sales regions according to the block groups that are contained in each of the regions. The county would still be in a single spatial database that is represented by the chains, nodes, and details data sets and the controlling spatial entry.
Coverage entries are created and modified by using the COVERAGE statement in the GIS procedure. You can view a formatted report of the contents of a coverage entry by submitting a COVERAGE CONTENTS statement in the GIS procedure. (The contents report for a coverage entry also includes all the contents information for the root spatial entry.)
See COVERAGE Statement
for more information about using the GIS procedure to create, modify, or view
the contents of coverage entries.
A layer entry is a SAS catalog entry of type GISLAYER that defines the set of features that compose a layer in the map. A layer entry contains the following elements:
The WHERE clause binds the layer entry to the spatial data even though it is stored in a separate entry. The layer is not bound to a specific spatial entry, just to those entries representing the same type of data. Therefore, a layer that is created for use with data that were imported from a TIGER file can be used with data that were imported from any TIGER file. The WHERE clause is checked for compatibility with spatial data when the layer entry is created and also whenever a map that uses the layer entry is opened.
Note: When defining area layers, you can specify
a composite association as an alternative to specifying an explicit WHERE
clause. However, the layer entry stores the WHERE clause that is implied by
the composite association. For example, if you specify STATE as the defining
composite association for a layer, and the STATE composite association has
the variable association VAR=(LEFT=STATEL,RIGHT=STATER), then the implied
WHERE clause that is stored in the layer entry is WHERE STATEL NE STATER.
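The relationship between a composite association and its implied WHERE clause can be sketched in Python. The function names `implied_where` and `on_area_boundary` are hypothetical; the sketch only mirrors the STATEL and STATER example above and does not describe how SAS/GIS stores or evaluates the clause internally.

```python
def implied_where(left_var, right_var):
    """Build the WHERE clause text implied by a left/right variable pair."""
    return f"WHERE {left_var} NE {right_var}"


def on_area_boundary(chain, left_var, right_var):
    """Apply the same test to one chain record (modeled as a dictionary)."""
    return chain[left_var] != chain[right_var]
```

The clause keeps only chains whose left and right values differ, that is, the chains that separate one area from another; chains with the same value on both sides are interior and do not outline an area.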
A map entry is a SAS catalog entry of type GISMAP. Map entries are the controlling entries for SAS/GIS maps because they tie together all the information that is needed to display a map. A map entry contains the following elements:
Map entries are created by using the MAP CREATE statement in the GIS procedure. However, much of the information that is stored in the map entry is specified interactively in the GIS Map window.
You can view a formatted report of the contents of a map entry by submitting a MAP CONTENTS statement in the GIS procedure. (The contents report for a map entry includes all the contents information for the spatial, coverage, and layer entries as well.)
See MAP Statement for details about the items that can be specified with the GIS procedure. See Chapter 10, "SAS/GIS Windows" in the SAS/GIS Software: Usage and Reference, Version 6 for details about the items that can be specified interactively in the GIS Map window.
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.