Creates three-dimensional scatter plots using values of three numeric variables from the input data set.
| Requirements: | Exactly one plot request is required. |
| Global statements: | FOOTNOTE, TITLE |
| Alias: | SCAT |
The SCATTER statement specifies one plot request that identifies the three numeric variables to plot. This statement automatically
You can use statement options to modify any of the three plot axes as well as the general appearance of the graph, control the
viewing angle, and specify characteristics for reference lines. In addition, if the needles drawn from the data points to the base plane complicate a graph, you can suppress
them.
You can use global statements to add text to the graph, and an Annotate data set to enhance the plot.
SCATTER plot-request </ option(s)>;
plot-request must be
option(s) can be one or more options from any or all of the following
categories:
-
y*x=z
-
specifies three numeric variables from the input data set:
-
y
-
is one of the variables that is plotted on the horizontal (x-y) plane.
-
x
-
is another of the variables that is plotted on the horizontal (x-y) plane.
-
z
-
is the variable that is plotted on the vertical (z)
axis.
The SCATTER statement does not require a full grid of observations for the horizontal
variables.
Options in a SCATTER statement affect
all graphs that are produced by that statement. You can specify as many options as you want and list them in any order.
-
ANNOTATE=Annotate-data-set
ANNO=Annotate-data-set
-
specifies a data set to annotate plots that are produced by the SCATTER statement.
-
CAXIS=axis-color
-
specifies a color for axis lines and tick marks. By default, axes display in the second color in the colors list.
-
COLOR='data-point-color'
| data-point-color-variable
-
specifies a color name or a character variable in the input data set whose values are color names. These color values determine the color or colors of the shapes
that represent a plot's data points. Color values must be valid color names for the device that is used. By default, plot shapes display in the third color in the current colors
list.
If you specify COLOR='data-point-color', all shapes are drawn in that color. For example, the procedure uses BLUE for all graph shapes when you specify
color='blue'
If you specify COLOR=data-point-color-variable, the color of the symbol is determined by the value of the color variable for that observation. For example,
the procedure uses the value of the variable CLASS as the color for each data point shape when you specify
color=class
Using COLOR=data-point-color-variable enables you to assign different colors to the shapes to classify data.
-
CTEXT=text-color
-
specifies a color for all text on the axes, including tick mark values and axis labels. If you omit this option, a color specification is searched for in this order:
-
the CTEXT= option in a GOPTIONS statement
-
the default, the first color in the colors
list.
-
DESCRIPTION='entry-description'
DES='entry-description'
-
specifies the description of the catalog entry for the chart. The maximum length for entry-description is 40 characters. The description does not appear
on the chart. By default, the procedure assigns a description of the form SCATTER OF y*x=z, where y*x=z is the request that is specified in the SCATTER
statement.
-
GRID
-
draws reference lines at the major tick marks on all axes.
-
NAME='entry-name'
-
specifies the name of the catalog entry for the graph. The maximum length for entry-name is eight characters. The default name is G3D. If the specified
name duplicates the name of an existing entry, SAS/GRAPH software adds a number to the duplicate name to create a unique
entry, for example, G3D1.
-
NOAXIS
NOAXES
-
specifies that a plot have no axes, axis labels, or tick mark values.
-
NOLABEL
-
specifies that a plot have no axis labels or tick mark values. Use this option if you want to generate axis labels and tick mark values with an Annotate data
set.
-
NONEEDLE
-
specifies that a plot have no lines that connect the shapes representing data points to the x-y plane. The NONEEDLE option has no effect when
SHAPE='PILLAR' or SHAPE='PRISM'.
-
ROTATE=angle-list
-
specifies one or more angles at which to rotate the x-y plane about the perpendicular z axis. The units for angle-list are degrees. By
default, ROTATE=70. Angle-list is either an explicit list of values, or a starting and an ending value with an interval increment, or a combination of both forms:
The values specified in angle-list can be negative or positive and can be larger than 360°. For example, a rotation angle of
45° can also be expressed as either of the following:
rotate=405
rotate=-315
You can specify a sequence of angles to produce separate graphs for each angle. The angles that are specified in the ROTATE= option are paired with any angles that are
specified with the TILT= option. If one option contains fewer values than the other, the last value in the shorter list is paired with the remaining values in the longer list.
-
SHAPE='symbol-name' |
shape-variable
-
specifies a symbol name or a character variable whose values are symbol names.
Symbols represent a scatter plot's data points. By default, SHAPE='PYRAMID'.
Values for symbol-name are
| BALLOON | DIAMOND | PRISM |
| CLUB | FLAG | PYRAMID |
| CROSS | HEART | SPADE |
| CUBE | PILLAR | SQUARE |
| CYLINDER | POINT | STAR |
Scatter Plot Symbols
illustrates these symbol types with needles.
Scatter Plot Symbols
If you specify
SHAPE='symbol-name', all data points are drawn in that shape. For example, the procedure draws all data points as balloons when you specify
shape='balloon'
If you specify SHAPE=shape-variable, the shape of the data point is determined by the value of the shape variable for that observation. For example, the
procedure uses the value of the variable CLASS for a particular observation as the shape for that data point when you specify
shape=class
Using SHAPE=shape-variable enables you to assign different shapes to the data points to classify data.
-
SIZE=symbol-size | size-variable
-
specifies either a constant or a numeric variable, the values of which determine the size of symbol shapes on the scatter
plot.
If you specify SIZE=symbol-size, all data points are drawn in that
size. For example, if you specify SIZE=3, the procedure draws all symbol shapes at three times the normal size. By default, SIZE=1.0. The value is a multiple of the default symbol size.
If you specify SIZE=size-variable, the size of the data point is determined by the value of the size variable for that observation. For example, when you specify SIZE=CLASS, the procedure
uses the value of the variable CLASS for each observation as the size of that data point. Using SIZE=size-variable enables you to assign different sizes to the data points to classify
data.
-
TILT=angle-list
-
specifies one or more angles at which to tilt the graph toward you. The units for angle-list are degrees. By default, TILT=70. Angle-list
is either an explicit list of values, or a starting and an ending value with an interval increment, or a combination of both forms:
The values that are specified in angle-list must be 0 through 90.
You can specify a sequence of
angles to produce separate graphs for each angle. The angles that are specified in the TILT= option are paired with any angles that are specified with the ROTATE= option. If one option contains
fewer values than the other, the last value in the shorter list is paired with the remaining values in the longer list.
-
XTICKNUM=number-of-ticks
YTICKNUM=number-of-ticks
ZTICKNUM=number-of-ticks
-
specify the number of major tick marks that are located on a plot's x, y, or z axis, respectively. The value for number-of-ticks must
be 2 or greater. By default, XTICKNUM=4, YTICKNUM=4, and ZTICKNUM=4.
-
ZMAX=max-value
ZMIN=min-value
-
specify the maximum and minimum values that are displayed on a plot's z axis. By default, the z axis is defined by the minimum and maximum
z values in the data. You can use the ZMIN= and ZMAX= options to extend the z axis beyond this range. The value that is specified by ZMAX= must be greater than that
specified by ZMIN=. If you specify a ZMAX= or ZMIN= value within the actual range of the z variable values,
the plot's data values are clipped at the specified level.
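The pairing of ROTATE= and TILT= angle lists described above can be sketched as follows. This is a minimal illustration, not taken from the original documentation: the POINTS data set and its values are hypothetical, and only options documented in this section are used.

```sas
/* Hypothetical sample data, for illustration only */
data points;
   input x y z;
   datalines;
1 1 2
1 2 4
2 1 3
2 2 6
3 1 5
3 2 8
;
run;

/* Requests three graphs. The ROTATE= list has three values and the
   TILT= list has two, so the last TILT= value (60) is reused:
   (rotate=30, tilt=45), (rotate=60, tilt=60), (rotate=90, tilt=60). */
proc g3d data=points;
   scatter y*x=z / rotate=30 60 90 tilt=45 60;
run;
quit;
```

Because the shorter list is extended with its last value, specifying a single TILT= value together with several ROTATE= values should hold the tilt constant while the plot is rotated.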
Use the COLOR=, SHAPE=, and SIZE= options to change the appearance of your scatter plot or
to classify data using color, shape, size, or any combination of these features. Scatter Plot Symbols illustrates the shape names that you can
specify
in the SHAPE= option.
For example, to make all of the data points red balloons at twice the normal size, use
scatter y*x=z /color='red' shape='balloon' size=2;
To size your points according to the values of the variable TYPE in your input data set, use
scatter y*x=z / size=type;
For an example, see Using Shapes in Scatter Plots.
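As a further sketch of classifying data with COLOR= and SHAPE= variables, the following DATA step derives both from a grouping variable. The data set CLASSED and the variables GROUP, COLORVAR, and SHAPEVAR are hypothetical names chosen for illustration. The color names are assumed to be valid for the device in use.

```sas
/* Hypothetical data: GROUP is a classification variable */
data classed;
   input x y z group $;
   length colorvar shapevar $ 8;
   /* Map each group to a color name and a symbol name */
   if group = 'A' then do;
      colorvar = 'RED';
      shapevar = 'BALLOON';
   end;
   else do;
      colorvar = 'BLUE';
      shapevar = 'CUBE';
   end;
   datalines;
1 1 2 A
1 2 4 B
2 1 3 A
2 2 6 B
3 1 5 A
3 2 8 B
;
run;

/* Each point takes its color and shape from its own observation */
proc g3d data=classed;
   scatter y*x=z / color=colorvar shape=shapevar;
run;
quit;
```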
You can approximate an overlaid scatter plot by graphing multiple values for the vertical (z) variables for a single (x, y) position in a single scatter plot. To
do this, add
a small value to the value of one of the horizontal variables (x or y) to give the observation a slightly different (x, y) position. Thus, you
enable the procedure to plot both values of the vertical (z) variable. Represent each different vertical (z) variable with a different symbol, size, or color. The resulting
plot appears to be multiple plots overlaid on the same axes.
For example, suppose you want to graph a data set that contains two values for the vertical variable Z for each
combination of variables X and Y. You could produce the original data set with a DATA step like this:
data planes;
input x y z shape $;
datalines;
1 1 1 PRISM
1 2 1 PRISM
1 3 1 PRISM
2 1 1 PRISM
2 2 1 PRISM
2 3 1 PRISM
3 1 1 PRISM
3 2 1 PRISM
3 3 1 PRISM
1 1 2 BALLOON
1 2 2 BALLOON
1 3 2 BALLOON
2 1 2 BALLOON
2 2 2 BALLOON
2 3 2 BALLOON
3 1 2 BALLOON
3 2 2 BALLOON
3 3 2 BALLOON
;
The SHAPE variable is assigned a different value for each different Z value for a single combination of X and Y values.
Ordinarily, the SCATTER
statement only plots the Z value for the last observation for a single combination of X and Y. However, you can use a DATA step to assign a slightly different x, y position
to all observations where Z is greater than 1:
data planes2;
set planes;
if z > 1 then x = x + .000001;
run;
Then you can use a SCATTER statement to produce a plot like the one in Simulated Overlaid Scatter Plot:
proc g3d data=planes2;
scatter x*y=z / zmin=0 shape=shape;
run;
quit;
Simulated Overlaid Scatter Plot
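The same offset technique can be extended to more than two Z values per (x, y) position. The following is one possible sketch rather than a documented method: it sorts the data, counts repeated positions with BY-group processing, and shifts X by a small multiple of that count. The data set names LAYERS, LAYERS_SORTED, and LAYERS_OFFSET and the counter variable DUP are hypothetical.

```sas
/* Hypothetical data: three Z values at each (x, y) position */
data layers;
   input x y z shape $;
   datalines;
1 1 1 PRISM
1 1 2 BALLOON
1 1 3 CUBE
1 2 1 PRISM
1 2 2 BALLOON
1 2 3 CUBE
2 1 1 PRISM
2 1 2 BALLOON
2 1 3 CUBE
2 2 1 PRISM
2 2 2 BALLOON
2 2 3 CUBE
;
run;

proc sort data=layers out=layers_sorted;
   by x y z;
run;

/* DUP counts repeats within each (x, y) group: 0, 1, 2, ...
   Each repeat is moved by a further 0.000001 along X. */
data layers_offset;
   set layers_sorted;
   by x y;
   if first.y then dup = 0;
   else dup + 1;
   x = x + dup * 0.000001;
   drop dup;
run;

proc g3d data=layers_offset;
   scatter x*y=z / zmin=0 shape=shape;
run;
quit;
```

The offset should stay far smaller than the spacing between real X values so that the shifted points still appear to share one position.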
Although you can use the SCATTER statement's ROTATE= option to alter the view of a plot, and therefore
the general orientation of the axis values, you cannot use SCATTER statement options to reverse axis values for one of the plot variables. To do this, you can multiply that variable's values by -1 to
reverse the values themselves, which has the result of reversing the axis when those values are used to generate a plot. You should then use PROC FORMAT to define a format that displays the variable's
values as they exist in the original data.
For example, the following code generates the scatter plot shown in
Default Y-axis Order:
data original;
input y x z;
datalines;
-1.15 1 .01
-1.00 2 .02
1.20 3 .03
1.25 4 .04
1.50 5 .05
2.10 1 .06
2.15 2 .07
2.20 3 .08
2.25 4 .09
2.30 5 .10
;
title1 'Default Y Axis Order';
/* default Y axis order */
proc g3d data=original;
scatter y * x = z;
run;
Default Y-axis Order
To reverse the Y axis in the plot that is shown in Default Y-axis Order, you can
write a DATA step like the following to reverse the Y values and, therefore, reverse the Y axis when the values are plotted:
data minus_y;
set original;
y=-y;
run;
The previous code creates the MINUS_Y data set by reading the ORIGINAL data set, and then multiplying the values of variable Y by -1. Although plotting Y values from the
MINUS_Y data set would reverse values on the Y axis, it would misrepresent the original data. Such a plot would label the axis with the negative-Y values. You can correct the problem by using PROC
FORMAT to display Y values as they are stored in the ORIGINAL data set:
proc format;
picture reverse
low -< 0 = '09.00'
0 <- high = '09.00' (prefix='-')
0 = '09.00';
run;
Here, the PICTURE statement defines a picture format named
REVERSE, which you can refer to in DATA and PROC steps by using the name followed by a period. A picture format is a template for printing numbers. The '09.00' specifications are digit
selectors that indicate which digits or columns in the variable values will display in output; columns that do not have a specified
digit selector will not be displayed in output. Thus, a picture format for displaying the values of variable Y needs a column for a minus sign, a column for units, and two columns for decimals. The
digit selector 0 specifies that no leading zeros will display in a column, and the digit selector 9 specifies that a leading zero will display in a column.
The PICTURE
statement defines this new picture format for three data ranges. The lowest value in the data up to but not including zero will display with no prefix, which means negative values will display without
a minus sign. All values above (but not including) zero to the highest value in the data will be displayed with the specified prefix, which in this case is a minus sign. Because zero is excluded from
both ranges, it is assigned its own picture with no prefix.
You can now assign the REVERSE format to the Y values from the MINUS_Y data set and use Y to generate a scatter
plot. The resulting plot displays Y's negative values without a prefix, and its positive values display with a minus sign prefix. This effectively represents Y values as they are stored internally in
the ORIGINAL data set, thus correcting the data misrepresentation that results from multiplying Y by -1.
The following code generates the scatter plot shown in
Reverse Y-axis Order:
title1 'Reverse Y Axis Order';
/* reverses order of default Y axis */
proc g3d data=minus_y;
format y reverse.;
scatter y * x = z;
run;
quit;
Reverse Y-axis Order
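Before plotting, it can help to confirm how the REVERSE format displays a few values. The following sketch repeats the picture format definition so that it runs on its own, then writes sample values to the SAS log with a PUT statement. The test values are arbitrary and chosen only for illustration.

```sas
proc format;
   picture reverse
      low -< 0    = '09.00'
      0   <- high = '09.00' (prefix='-')
      0           = '09.00';
run;

/* Write each raw value and its formatted form to the log.
   Negative values should display without a minus sign, and
   positive values should display with one. */
data _null_;
   do y = -2.25, -1, 0, 1.15;
      put y= 'displays as ' y reverse.;
   end;
run;
```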
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.