| SAS/ACCESS Interface to IMS-DL/I: Reference |
In some ways the structures of IMS-DL/I databases and tabular
files (such as SAS data sets) are comparable, but in other ways they differ.
For example, database fields and data set variables are similar, and database
records are like data set observations because both contain data about one
entity. At the same time, however, database records differ from data set observations
because subsets of records can be accessed while you cannot access a subset
of a data set observation. The observation is stored and accessed as a unit.
A tabular file has nothing comparable to a segment.
The concept of data segments is one of the things that makes a hierarchical
database different from a SAS data set and other tabular files.
Consider banking data as an example. Customer information
maintained by a bank might include the following:
If this
information is stored in a tabular file, each item of information is a variable, and all of the variables for any given person comprise one
observation. You can visualize the layout of banking data in a tabular file
as shown in Tabular File Structure.
Tabular File Structure
The rows in the table represent
observations (customers), and the columns represent variables. The structure
of the file is such that a maximum number of variables for debits and credits
must be defined when the file is created. In Tabular File Structure there are variables for up to only three debits
and three credits per customer, which presents a problem if a customer has
more than three debit or credit transactions.
The same data can also be stored in an IMS-DL/I database
but would be structured very differently. For example, Hierarchical File Structure shows one way the banking information
could be structured in IMS-DL/I. The
sample database, called ACCTDBD, is used in this book and described in About the Example Data in This Book in Chapter
1, "Introducing the SAS/ACCESS Interface
to IMS-DL/I."
Hierarchical File Structure
Each block in the figure represents a segment
type, which is a grouping of related fields of data. There
are three levels in the ACCTDBD database hierarchy and seven segment types.
For each database record, the top or first level has only one segment, called
the root segment. The root segment in the ACCTDBD database is
called CUSTOMER; it contains fields with these data: Social Security number,
customer name, address, city, state, country, ZIP code, home phone, and work
phone. The segments under the root segment are dependent segments
called CHCKACCT, CHCKDEBT, CHCKCRDT, SAVEACCT, SAVEDEBT, and SAVECRDT. Each
of the dependent segments contains fields of data, as shown in the following
table.
|
DependentSegment |
Fields |
|
CHCKACCT |
checking account number, current balance, last statement
date, last statement balance |
|
CHCKDEBT |
checking account debit date and time, amount, description |
|
CHCKCRDT |
checking account credit date and time, amount, description |
|
SAVEACCT |
savings account number, current balance, last statement
date, last statement balance |
|
SAVEDEBT |
savings account debit date and time, amount, description |
|
SAVECRDT |
savings account credit date and time, amount, description |
The hierarchical database
structure
is useful for storing multiple occurrences of any given element of information,
especially if there are varying numbers of occurrences of the data for each
record.
Consider the dependent segment SAVEACCT, which contains
the following fields:
Different customers can have different
numbers of savings
accounts; some may have none, others may have two or three. If the data are
not segmented, there must be space in each customer's record for the maximum
number of savings accounts per customer. With the segmented structure, however,
it is possible to have one SAVEACCT segment occurrence for each savings account
a customer has. Any segment type can have an unlimited number of segment occurrences.
Although the segment types are predefined, the number of segment occurrences
is not predefined. Note that each occurrence of a root segment represents
a separate record.
Here is an example of how a segment type can have an
unlimited number of segment occurrences. A certain customer has two savings
accounts. In one month, the customer has two deposits for account number 111
and one deposit and two withdrawals for account number 222. Sample Record shows the customer's record.
Sample Record
Sample Record
shows eight segment occurrences within four (shaded) segment types.
The information in a hierarchical
database is subdivided or segmented according to a logical scheme. Moving
from top to bottom through the database, there is a relationship between the
segments. A segment that is hierarchically dependent on a segment one level
up in the hierarchy is said to be the child. The segment on which
it is dependent is the parent. CUSTOMER is a parent segment with
two children, CHCKACCT and SAVEACCT. CHCKACCT, in turn, is the parent of CHCKDEBT
and CHCKCRDT, and SAVEACCT is the parent of SAVEDEBT and SAVECRDT. Segments
that share a parent are called siblings; for example, CHCKACCT
and SAVEACCT are siblings. Multiple segment occurrences of one segment type
with the same parent occurrence are called twins. For example,
SAVEACCT 111 and SAVEACCT 222 are twins.
All dependent segments are children but are not necessarily
parents. The root segment (CUSTOMER), on the other hand, is a parent if any
dependent segments exist, but it is never a child. (It is possible to have
a database with no dependent segments, that is, with only one level, the root
segment.) In a hierarchical structure, there can be only one parent segment
for a child segment.
Segments can also be grouped by paths. Two segments belong to the same path if one is a dependent of
the other. You can access multiple segments in a path at the same time.
These relationships are shown in Segment Relationships.
Segment Relationships
-
Parents:
-
A, B, E1, E2
-
Children:
-
B, E1, E2, C1, C2, D, F1, F2, G, H
-
Twins:
-
C1 and C2, E1 and E2, F1 and F2
-
Siblings:
-
B, E1, and E2; C1, C2, and D; G and H
-
Paths:
-
A, B, and C1; A, B, and C2; A, B, and D;
A, E1, and F1; A, E1, and F2; A, E2, and G; A, E2, and H
You can navigate
one path of an IMS-DL/I database at
a time with the interface view engine in Version 7 of the SAS System. That
is, you can select items in one path of the database when creating a view
descriptor. Consider A Database Path,
which shows one path of data shaded. The SAS/ACCESS interface
processes each record occurrence from top to bottom and from left to right
following these rules:
-
The first occurrence of a root segment is processed
first.
-
Then, the first child of a root segment in the
defined path is processed before twins of the root segment.
-
Twins are processed after a child (if any) down
that path. Twins are processed in order of occurrence. Any child of a twin
is processed according to this rule.
-
After the child and twins are processed for that
one path, the next eligible root segment in the path is processed.
Note:
No siblings are
processed. ![[cautend]](../common/images/cautend.gif)
A Database Path
illustrates a path of data in a particular program view. The numbering indicates
the order of processing.
A Database Path
There are three types
of fields in the segments of an IMS-DL/I database:
-
A sequence field (or key field)
is a field that identifies and provides access to segments in a database.
A sequence field is defined to IMS-DL/I in the Database Description (DBD), which specifies characteristics of a database.
In some cases, a sequence field sequences twin segment occurrences in ascending
order, according to their sequence field values. For example, if the sequence
field of the CHCKACCT segment is ACNUMBER, twin CHCKACCT occurrences in a
given customer's record are ordered from the lowest to highest account number.
Root segments usually have a sequence field, but dependent segments do not
necessarily have them.
In a root segment, the sequence field also uniquely
identifies the record. In dependent segments, the sequence field can provide
unique identification, but this is not required. Root segments may or may
not be sequenced by the sequence field, depending on the IMS-DL/I access method
used to store the database.
-
A search field is defined to IMS-DL/I in the DBD and is used
to search through the database for particular values. For example, CUSTZIP
is defined as a search field in the CUSTOMER segment, permitting the SAS/ACCESS interface
to IMS-DL/I to search the database for records containing a specific ZIP code.
-
An undefined field is not defined to IMS-DL/I. All fields other
than sequence fields and search fields do not have to be defined in the DBD.
IMS-DL/I does not know the format of an undefined field and cannot search
for segments based on values in an undefined field. The format of an undefined
field is determined by the program that loads the database initially.
The CUSTOMER segment of the ACCTDBD database contains
examples of two kinds of fields. There are fields for Social Security number,
name, city, state, country, ZIP code, address, home phone, and work phone.
The Social Security number field is defined as the sequence field, meaning
that it uniquely identifies the record. The name field of CUSTOMER does not
uniquely identify a record because customer names may be duplicated. However,
because names can be used to search through the database, the name field is
defined as a search field, as are the address, city, state, country, ZIP code,
home phone, and work phone fields.
The sequence field of a root segment allows direct access
to the root segment. The sequence field of a dependent segment does not allow
direct access to the record, but IMS-DL/I finds segments faster when searching
on sequence fields rather than search fields.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.