Chapter Contents
Chapter Contents
The VARIOGRAM Procedure

Theoretical and Computational Details of the Semivariogram

The basic starting point in computing the semivariogram is the enumeration of pairs of points for the spatial data. Figure 70.11 shows a spatial domain in which a set of measurements are made at the indicated locations. Two points P1 and P2, with coordinates (x1,y1), (x2,y2), are selected for illustration. A vector, or directed line segment, is drawn between these points. This pair is then categorized first by orientation of this directed line segment and then by its length. That is, the pair P1P2 is placed into an angle and distance class.

vard1b.gif (1491 bytes)

Figure 70.11: Selection of Points P1 and P2 in Spatial Domain

Angle Classification

Suppose you specify NDIR=3 in the COMPUTE statement in PROC VARIOGRAM. This results in three angle classes defined by midpoint angles between 0o and 180o: 0^o +- \delta \theta, 60^o +- \delta \theta, and 120^o +- \delta \theta, where \delta \theta is the angle tolerance. If you do not specify an angle tolerance using the ATOL= option in the COMPUTE statement, the following default value is used.
\delta \theta = \frac{180^o}{2 x NDIR}

For three classes, \delta \theta = 30^o.When the example directed line segment P1P2 is superimposed on the coordinate system showing the angle classes, its angle, measured clockwise from north, is approximately 45o. In particular, it falls within [60^o - \delta \theta, 60^o + \delta \theta)= 
[30^o,90^o), the second angle class. See Figure 70.12.

vard1c.gif (2463 bytes)

Figure 70.12: Selected Pair P1P2 Falls within the Second Angle Class

Note that if the designated points P1 and P2 are labeled in the opposite order, the orientation is in a reciprocal direction, that is, approximately 225o for the point pair instead of approximately 45o. This does not affect angle class selection; the angle classes [60^o - \delta \theta, 60^o + \delta \theta) and [240^o - \delta \theta, 240^o + \delta \theta) are the same.

If you specify an angle tolerance less than the default, for example, ATOL=15o, some point pairs might be excluded. For example, the selected point pair P1P2 in Figure 70.12, while closest to the 60o axis, might lie outside [60-\delta \theta,60+\delta \theta)=
[45^o,75^o). In this case, the point pair P1P2 would be excluded from the variogram computation.

On the other hand, you can specify an angle tolerance greater than the default. This can result in a point pair being counted in more than one angle class. This has a smoothing effect on the variogram and is useful when there is a small amount of data available.

An alternative way to specify angle classes and angle tolerances is with the DIRECTIONS statement. The DIRECTIONS statement is useful when angle classes are not equally spaced. When you specify the DIRECTIONS statement, you should also specify the angle tolerance. The default value of the angle tolerance is 45o when a DIRECTIONS statement is used instead of the NDIRECTIONS= option in the COMPUTE statement. This may not be appropriate for a particular set of angle classes. See the "DIRECTIONS Statement" section for more details on the DIRECTIONS statement.

Distance Classification

Next, the distance class for the point pair P1P2 is determined. The directed line segment P1P2 is superimposed on the coordinate system showing the distance or lag classes. These classes are determined by the LAGD= specification in the COMPUTE statement. Denoting the length of the line segment by | P1P2 | and the LAGD value by \Delta, the lag class L is determined by

L(P_1P_2) = \lfloor \frac{| P_1P_2 | + .5}{\Delta} \rfloor
where \lfloor x \rfloor denotes the largest integer \le x.

When the directed line segment P1P2 is superimposed on the coordinate system showing the distance classes, it is seen to fall in the first lag class; see Figure 70.13 for an illustration for \Delta=1.

vard1d.gif (1564 bytes)

Figure 70.13: Selected Pair P1P2 Falls within the First Lag Class

Because pairwise distances are positive, lag class zero is smaller than lag classes 1, ... , MAXLAG-1. For example, if you specify LAGD=1.0 and MAXLAG=10, and you do not specify a LAGTOL= value in the COMPUTE statement in PROC VARIOGRAM, the ten lag classes generated by the preceding equation are

[0,.5), [.5,1.5), [1.5,2.5), ... , [8.5,9.5)

This is because the default lag tolerance is one-half the LAGD= value, resulting in no gaps between the distance class intervals. This is shown in Figure 70.14.


vard1e.gif (823 bytes)

Figure 70.14: Lag Distance Axis Showing Lag Classes

On the other hand, if you do specify a distance tolerance with the DTOL= option in the COMPUTE statement, a further check is performed to see if the point pair falls within this tolerance of the nearest lag. In the preceding example, if you specify LAGD=1.0 and MAXLAG=10 (as before) and also specify LAGTOL=0.25, the intervals become

[0,0.25), [0.75,1.25), [1.75,2.25), ... , [8.75,9.25)

Note that this specification results in gaps in the lag classes; a point pair P1P2 might fall, for example, in the interval

| P_1P_2 | \in [1.25,1.75)
and hence be excluded from the semivariogram calculation. The maximum LAGTOL= value allowed is half the LAGD= value; no overlap of the distance classes is allowed.

Bandwidth Restriction

Because the areal segments generated from the angle and distance classes increase in area as the lag distance increases, it is sometimes desirable to restrict this area (Duetsch and Journel 1992, p. 45). If you specify the BANDW= option in the COMPUTE statement, the lateral, or perpendicular, distance from the axis defining the angle classes is fixed.

For example, suppose two points P3, P4 are picked from the domain in Figure 70.11 and are superimposed on the grid defining distance and angle classes, as shown in Figure 70.15.


vard1f.gif (2015 bytes)

Figure 70.15: Selected Pair P3P4 Falls Outside Bandwidth Limit

The endpoint of vector P3P4 falls within the angle class around 60o and the 5th lag class; however, it falls outside the restricted area defined by the bandwidth. Hence, it is excluded from the semivariogram calculation.

Finally, a pair PiPj that falls in a lag class larger than the value of the MAXLAG= option is excluded from the semivariogram calculation.

From this description, it is clear that the number of pairs within each angle/distance class is strongly affected by the angle and lag tolerances. Since it is desirable to have the maximum number of point pairs within each class, the angle tolerance and the distance tolerance should usually be the default values.

Semivariogram Computation

With the classification of a point pair PiPj into an angle/distance class, as shown in the preceding section, the semivariogram computation proceeds as follows.

Denote all pairs PiPj belonging to angle class [\theta_k-\delta \theta_k,\theta_k+\delta \theta_k)and distance class L=L(PiPj) by N(\theta_k,L). For example, in the preceding illustration, P1P2 belongs to N(60o,1).

Let | N(\theta_k,L) | denote the number of such pairs. Let Vi, Vj be the measured values at points Pi, Pj. The component of the standard (or method of moments) semivariogram corresponding to angle/distance class N(\theta_k,L) is given by

2\gamma(h_k) = \frac{1}{| N(\theta_k,L) |}\sum_{P_iP_j \in N(\theta_k,L)}(V_i-V_j)^2
where hk is the average distance in class N(\theta_k,L); that is,

h_k = \frac{1}{| N(\theta_k,L) |}\sum_{P_iP_j \in N(\theta_k,L)}| P_iP_j |

The robust version of the semivariogram, as suggested by Cressie (1993), is given by

2\bar{\gamma}(h_k) = \frac{\Psi^4(h_k)}{0.457 + 0.494/N(\theta_k,L)}

\Psi(h_k) = \frac{1}{N(\theta_k,L)}\sum_{P_iP_j \in N(\theta_k,L)}(V_i-V_j)^{\frac{1}2}

This robust version of the semivariogram is computed when you specify the ROBUST option in the COMPUTE statement in PROC VARIOGRAM.

PROC VARIOGRAM computes and writes to the OUTVAR= data set the quantities h_k, \theta_k, L, N(\theta_k,L), \gamma(h), and \bar{\gamma}(h).

Chapter Contents
Chapter Contents

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.