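The input data set must contain at least three numeric variables: two horizontal variables (x and y) and one or more vertical variables (z) that are interpolated or smoothed as functions of the two horizontal variables.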
The procedure can process multiple vertical variables for each pair of horizontal variables that
you specify. If you specify more than one vertical variable, the G3GRID procedure performs a separate analysis and produces interpolated or smoothed values for each vertical variable. If more than one
observation in the input data set has the same values for both horizontal variables, x and y, a warning message is printed, and only the first such point is used in the
interpolation.
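The duplicate-handling rule described above can be sketched outside SAS. This is a minimal Python illustration, not the procedure's implementation; the function name `keep_first_xy` and the `(x, y, z)` tuple layout are assumptions made for the example.

```python
def keep_first_xy(observations):
    """Keep only the first observation for each (x, y) pair.

    `observations` is an iterable of (x, y, z) tuples (hypothetical layout).
    Returns the retained observations and the number of dropped duplicates.
    """
    seen = set()
    kept = []
    dropped = 0
    for x, y, z in observations:
        key = (x, y)
        if key in seen:
            dropped += 1  # a point like this would trigger the warning
            continue
        seen.add(key)
        kept.append((x, y, z))
    return kept, dropped


# The third observation repeats the (x, y) pair of the first one.
points = [(0, 0, 1.0), (1, 0, 2.0), (0, 0, 9.0), (1, 1, 3.0)]
kept, dropped = keep_first_xy(points)
print(kept)
print(dropped)
```

If the duplicates carry different z values that should all count, one option is to average them in a preprocessing step before calling the procedure.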
By default, the procedure scales both horizontal variables similarly before it interpolates, because the interpolation methods assume that the scales of
x and y are comparable.
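To see why comparable scales matter, consider distances between points when x is measured in thousands and y in fractions: x dominates every distance. The sketch below rescales each horizontal variable to the range 0 to 1. The exact scaling that the procedure applies is not stated here, so min-max scaling and the helper name `rescale_to_unit_range` are assumptions chosen only for illustration.

```python
def rescale_to_unit_range(values):
    """Map values linearly onto [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# x and y cover very different ranges before scaling.
x = [0.0, 500.0, 1000.0]
y = [0.0, 0.5, 1.0]
xs = rescale_to_unit_range(x)
ys = rescale_to_unit_range(y)
print(xs)
print(ys)
```

After rescaling, a step in x and a step in y contribute equally to distances such as those used to pick nearest neighbors.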
In the GRID statement, you can name multiple vertical variables (z through z-n) and produce a data set that contains two horizontal variables and multiple
vertical variables. You can use the resulting data set to produce plots of the relationships of the two horizontal variables to different vertical variables.
If the points that are generated by the horizontal variables tend to lie along a curve, a poor interpolation or spline may result. In such cases, the vertical variable(s) and one of the horizontal
variables should be modeled as a function of the remaining horizontal variable. You can use a scatter plot of the two horizontal variables to help determine the appropriate
function.
If the horizontal variable points are collinear, the procedure interpolates the function as constant along lines perpendicular to the line in the plane that is
generated by the input data points.
The output data set contains the two horizontal variables, the interpolated or smoothed vertical variables, and the BY variables, if any. If the GRID statement's SMOOTH= option is used, the output
data set also contains a variable named _SMTH_, with a value equal to that of the smoothing parameter.
You can control both the number of x and y
values in the output data set and the values themselves. In addition, you can specify an interpolation method.
The G3GRID procedure can use one of three interpolation methods: bivariate interpolation (the default), spline interpolation, and smoothed spline interpolation.
Unless you specify the SPLINE option, the G3GRID procedure is an interpolation procedure. That is, it calculates z values for x, y points that are missing from
the input grid. The surface that is formed by the interpolated data passes precisely through the data points in the input data set.
This default method of interpolation
works best for fairly smooth functions with values given at uniformly distributed points in the plane. If the data points in the input data set are erratic, the default interpolated surface can be
erratic.
This default method is a modification of that described by Akima (1978). This method consists of
- dividing the plane into nonoverlapping triangles that use the positions of the available points
- fitting a bivariate fifth-degree polynomial within each triangle
- calculating the interpolated values by evaluating the polynomial at each grid point that falls in the triangle.
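The coefficients for the polynomial are computed based on the values of the function at the vertices of the triangle and on the estimated values of the first and second derivatives of the function at the vertices.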
The estimates of the first and second derivatives are computed using the n nearest neighbors of the point, where
n is the number specified in the GRID statement's NEAR= option. A Delaunay triangulation (Ripley 1981, p. 38) is used for the default method. The coordinates of the triangles are
available in an output data set if requested by the OUTTRI= option in the PROC G3GRID statement.
If you specify the SPLINE option, a method is used that produces an interpolation or smoothing that is optimally smooth in a certain sense (Harder and Desmarais 1972, Meinguet 1979). The surface that
is generated can be thought of as one that would be formed if a stiff, thin metal plate were forced through or near the given data points. For large data sets, this method is substantially more
expensive than the default method.
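The function u, formed when you specify the SPLINE option, is determined by letting

$$t_j = (x_j, y_j), \qquad t = (x, y)$$

and

$$|t - t_j| = \left( (x - x_j)^2 + (y - y_j)^2 \right)^{1/2}$$

where

$$u(x, y) = \sum_{j=1}^{n} c_j E(t, t_j) + d_1 + d_2 x + d_3 y$$

and

$$E(s, t) = |s - t|^2 \log(|s - t|).$$

The coefficients c1, c2,..., cn and d1, d2, d3 of this function are determined by
these equations:

$$(E + n \lambda I) c + T d = z$$

and

$$T^{\mathsf{T}} c = 0$$

where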
- E is the n × n matrix E(ti, tj)
- I is the n × n identity matrix
- λ is the smoothing parameter that is specified in the SMOOTH= option
- c is (c1,..., cn)
- z is (z1,..., zn)
- d is (d1, d2, d3)
- T is the n × 3 matrix whose ith row is (1, xi, yi).
See Wahba (1979) for more detail.
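The linear system for the spline coefficients can be sketched in a few lines of Python. This is an illustration only, not the procedure's implementation. It assumes the thin-plate kernel E(s, t) = |s - t|^2 log|s - t| (taken as 0 when s = t) and the system (E + nλI)c + Td = z with T'c = 0. The function names `kernel`, `solve`, `fit_spline`, and `evaluate` and the sample data are hypothetical.

```python
import math


def kernel(s, t):
    """E(s, t) = |s - t|^2 * log|s - t|, taken as 0 when s == t."""
    r = math.hypot(s[0] - t[0], s[1] - t[1])
    return 0.0 if r == 0.0 else r * r * math.log(r)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_spline(points, z, lam=0.0):
    """Return (c, d) solving (E + n*lam*I) c + T d = z and T' c = 0."""
    n = len(points)
    size = n + 3
    a = [[0.0] * size for _ in range(size)]
    for i, p in enumerate(points):
        for j, q in enumerate(points):
            a[i][j] = kernel(p, q) + (n * lam if i == j else 0.0)
        row_t = (1.0, p[0], p[1])  # ith row of T
        for k in range(3):
            a[i][n + k] = row_t[k]  # T block
            a[n + k][i] = row_t[k]  # T' block
    sol = solve(a, list(z) + [0.0, 0.0, 0.0])
    return sol[:n], sol[n:]


def evaluate(points, c, d, x, y):
    """u(x, y) = sum_j c_j E(t, t_j) + d1 + d2*x + d3*y."""
    total = d[0] + d[1] * x + d[2] * y
    for cj, tj in zip(c, points):
        total += cj * kernel((x, y), tj)
    return total


# Five non-collinear sample points with made-up z values.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
zs = [0.0, 1.0, 1.0, 0.0, 2.0]

c, d = fit_spline(pts, zs)             # lam = 0: pure interpolation
fitted = [evaluate(pts, c, d, x, y) for x, y in pts]

cs, ds = fit_spline(pts, zs, lam=0.1)  # lam > 0: smoothed fit
smoothed = [evaluate(pts, cs, ds, x, y) for x, y in pts]
```

With `lam=0` the fitted surface passes through the data points. With `lam > 0` the first equation implies that the residual at the ith data point equals n·λ·c_i, so larger values of λ let the surface move farther from the data.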
To produce a smoothed spline, you can use the GRID statement's SMOOTH= option with the SPLINE option. The value or values specified in the SMOOTH= option are substituted for λ
in the equation (E + nλI)c + Td = z
that is described in Spline Interpolation. A smoothed spline trades closeness to the original data points for smoothness. To find
a value that produces the best balance between smoothness and fit to the original data, you can
try several values for the SMOOTH= option.
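The trade-off can be stated as a penalized least-squares problem. The form below is the standard thin-plate smoothing spline criterion (see Wahba 1979); it is given here as background, under the assumption that the procedure's λ corresponds to this criterion up to the normalization of E.

```latex
\min_{u} \;
\frac{1}{n} \sum_{i=1}^{n} \bigl( z_i - u(x_i, y_i) \bigr)^2
\; + \; \lambda \iint
\left[
  \left( \frac{\partial^2 u}{\partial x^2} \right)^2
  + 2 \left( \frac{\partial^2 u}{\partial x \, \partial y} \right)^2
  + \left( \frac{\partial^2 u}{\partial y^2} \right)^2
\right] dx \, dy
```

The first term measures fit to the data and the second measures roughness. As λ approaches 0 the solution approaches the interpolating spline, and as λ grows the solution approaches a least-squares plane.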
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.