## Terminology

*Basic Pareto Charts*

A basic Pareto chart (see Figure 26.1)
analyzes the unique values of a *process variable*,
which are referred to as *Pareto categories* or *levels*.
These values typically represent problems encountered during
some phase of a manufacturing or
service activity.

A basic vertical Pareto chart (as produced by the PARETO procedure's
VBAR statement) has one horizontal and two vertical axes. The
horizontal (or *category*) axis is displayed at the bottom of the
chart and lists the Pareto categories. The *primary
vertical axis* (or *frequency axis*) is displayed on
the left. The relative frequency
of each Pareto category is represented by a vertical bar whose
height is measured on the primary vertical axis. You can use
the SCALE= option to scale this axis in percent, count, or weight
units. The *secondary vertical axis* (or *cumulative percent
axis*) is displayed on the right.
This axis is scaled in cumulative percent units and is used to
read the *cumulative percent curve*. The height of each
point on the curve represents the percent of the total frequency
accounted for by the Pareto categories to the left of the point.

In a horizontal Pareto chart (as produced by the HBAR statement),
the category axis is displayed vertically on the left. Categories
appear in order of decreasing relative frequency from top to bottom.
The frequency axis appears at the top of the chart and the
cumulative percent axis is at the bottom.
The relative frequencies of the Pareto categories are represented by
horizontal bars. A point on the cumulative percent curve
represents the percent of the total frequency accounted for by the
Pareto categories above that point.

**Note:** For the sake of brevity, in this chapter the term *height*
is used to refer to the size of a bar as measured along the frequency
axis, whether the Pareto chart is oriented vertically or horizontally.
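
The quantities read from the two vertical axes can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the procedure's implementation: the function name `pareto_table` and the defect data are hypothetical, ties are broken by category label, and the cumulative percent at each point is assumed to include the category at that point.

```python
from collections import Counter

def pareto_table(values):
    """Return rows of (category, count, percent, cumulative percent),
    ordered by decreasing count (ties broken by category label)."""
    counts = Counter(values)
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    rows, running = [], 0
    for category, count in ordered:
        running += count  # running total includes the current category
        rows.append((category, count,
                     100.0 * count / total,
                     100.0 * running / total))
    return rows

# Hypothetical process variable: one value per observed problem.
defects = (["Contamination"] * 14 + ["Oxide Defect"] * 8
           + ["Corrosion"] * 2 + ["Doping"] * 1)

for row in pareto_table(defects):
    print(row)
```

The second column corresponds to bar heights in count units, the third to bar heights in percent units, and the fourth to the points of the cumulative percent curve.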

*Restricted Pareto Charts*

A *restricted Pareto chart* (see Figure 26.6)
displays only the *n* most frequently
occurring categories in a data set that contains *N*
categories, where *N*>*n*. The remaining *N*-*n* categories
are dropped or are merged into a single "other" category
created with the OTHER= option. The MAXCMPCT=, MAXNCAT=,
and MINPCT= options provide alternative methods for specifying *n*.
See the entries for these options in "Dictionary of Options".
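
As a rough illustration of the restriction, the following Python sketch keeps the *n* most frequent categories and either drops the rest or merges them into one bar. The function name `restrict`, the sample counts, and the placement of the merged bar at the end of the list are assumptions made for this example; they are not taken from the procedure.

```python
def restrict(counts, n, other_label=None):
    """Keep the n most frequent categories.

    If other_label is given, the remaining categories are merged into
    a single category with that label; otherwise they are dropped.
    """
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    kept, rest = ordered[:n], ordered[n:]
    if other_label is not None and rest:
        # Assumption: the merged bar is appended after the kept bars.
        kept.append((other_label, sum(count for _, count in rest)))
    return kept

# Hypothetical counts for N = 5 categories.
counts = {"Contamination": 14, "Oxide Defect": 8,
          "Corrosion": 2, "Doping": 1, "Metallization": 1}

print(restrict(counts, 2))                       # remaining 3 dropped
print(restrict(counts, 2, other_label="Other"))  # remaining 3 merged
```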

*Weighted Pareto Charts*

A *weighted Pareto chart*
(see Example 29.8)
displays bars whose heights represent the weighted
frequencies of the categories.
Typical weights are the cost of
repair or the loss incurred by the customer.
The weight *W*_{i} for the
*i*^{th} Pareto category is computed as

$$W_i = \sum_{u \in C_i} w(u)\, f(u)$$

where *C*_{i} is the set of observations that make
up the *i*^{th} category,
*w*(*u*) is the value of the weight variable in the
*u*^{th} observation,
and *f*(*u*) is the value of the frequency variable in the
*u*^{th} observation (taking *f*(*u*) = 1 if a FREQ= variable is not specified).
If SCALE=WEIGHT is specified,
the height of the bar
for the *i*^{th} category is *W*_{i}.
If SCALE=PERCENT is specified,
the height of this bar is

$$\frac{W_i}{\sum_{j=1}^{N} W_j} \times 100\%$$

where *N* is the total number of categories.
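
The weighted heights can be checked with a short Python sketch. It assumes that the weight of a category is the sum of *w*(*u*) times *f*(*u*) over the observations in that category, and that the percent height is that weight divided by the sum of all category weights, times 100. The function name `weighted_heights` and the data are hypothetical.

```python
def weighted_heights(observations):
    """observations: iterable of (category, weight, freq) tuples.

    Returns {category: (W_i, percent height)}.
    """
    totals = {}
    for category, w, f in observations:
        # Accumulate w(u) * f(u) over the observations in each category.
        totals[category] = totals.get(category, 0.0) + w * f
    grand_total = sum(totals.values())
    return {category: (W, 100.0 * W / grand_total)
            for category, W in totals.items()}

# Hypothetical data: weight is a repair cost, freq is a count.
# Use freq = 1 for every observation to mimic an unspecified FREQ= variable.
obs = [("Contamination", 2.0, 3),
       ("Contamination", 1.0, 4),
       ("Corrosion", 10.0, 1),
       ("Doping", 5.0, 4)]

heights = weighted_heights(obs)
for category, (weight, percent) in heights.items():
    print(category, weight, percent)
```

The first number in each pair is the bar height under SCALE=WEIGHT and the second is the bar height under SCALE=PERCENT.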

*Comparative Pareto Charts*

A *comparative Pareto chart* combines two or more Pareto
charts for the same process variable. The component charts
are displayed with uniform axes to facilitate comparison. The
observations represented by a component chart are referred
to as a *cell*. The framed areas for the component charts
are referred to as *tiles*.

In a *one-way comparative Pareto chart*,
each component chart corresponds to a different level of a single
classification variable specified with the CLASS= option.
The component charts are arranged in a stack or a row, as
illustrated in Output 29.1.3, Output 29.2.2, and Output 29.2.3.

In a *two-way comparative Pareto chart*,
each component chart corresponds to a different combination
of levels of two classification variables specified with the
CLASS= option. The component charts are arranged in a matrix, as
illustrated in Output 29.2.4.

In any comparative Pareto chart there is a *key cell*,
in which the bars are in decreasing order and whose order is imposed
on all the other cells to achieve a uniform category axis.
By default, the key cell is the cell in the upper left corner, but
you can use the CLASSKEY= option to designate any other cell as
the key cell. In this case, the rows and columns of the comparative
chart will be rearranged so that the key cell appears in the upper
left. However, if you require the rows and columns in a particular
order, you can specify the NOKEYMOVE option in conjunction with
the CLASSKEY= option to suppress the rearrangement.

If you are creating your chart with a graphics device, you
can use the NROWS= and NCOLS= options to specify the numbers
of rows and columns in a comparative Pareto chart. By default,
NROWS=2 and NCOLS=1 for a one-way comparison and NROWS=2 and
NCOLS=2 for a two-way comparison. There is no upper limit to
the number of rows or columns that you can specify, but in
practice the limit is determined by the display area of your
graphics device. If the numbers of classification variable
levels exceed the NROWS= and NCOLS= values, the chart is created
on multiple screens or pages.

If the same set of Pareto categories does not occur in each
cell of a comparative Pareto chart, the categories are said to be
*unbalanced*. In this case, the procedure uses the following
convention to construct the uniform category axis. First, the
categories that occur in the key cell are arranged on the category
axis from left to right (top to bottom for a horizontal chart),
sorted in decreasing order of frequency,
with tied levels arranged in order of their formatted values.
The categories not in the key cell are assigned frequencies of
zero in the key cell, and they are arranged at the right (bottom) of the
category axis, where they are ordered by their formatted values.
This arrangement is simply a convention of the procedure and should
not be interpreted to mean that one category is more important
than another.

Whether the categories in the input data set are balanced or not,
the categories in the OUT= data set are always balanced.
The procedure balances this data set by assigning values of zero
to the _COUNT_ and _PCT_ variables as necessary.
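
The convention for unbalanced categories can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's code: the function name `uniform_axis` and the two cells are hypothetical, and `str()` stands in for the formatted value of a category.

```python
def uniform_axis(cells, key_cell):
    """cells: {cell name: {category: count}}.

    Returns (axis, balanced), where axis is the uniform category order
    and balanced fills categories absent from a cell with a count of zero.
    """
    key_counts = cells[key_cell]
    # Key-cell categories: decreasing frequency, ties by formatted value.
    in_key = sorted(key_counts, key=lambda c: (-key_counts[c], str(c)))
    # Categories absent from the key cell go last, by formatted value.
    all_categories = set().union(*(counts.keys() for counts in cells.values()))
    not_in_key = sorted(all_categories - set(key_counts), key=str)
    axis = in_key + not_in_key
    balanced = {name: {cat: counts.get(cat, 0) for cat in axis}
                for name, counts in cells.items()}
    return axis, balanced

# Hypothetical one-way comparison with two cells.
cells = {
    "Before": {"Contamination": 9, "Oxide Defect": 5, "Doping": 5},
    "After": {"Contamination": 3, "Corrosion": 4, "Silicon Defect": 1},
}

axis, balanced = uniform_axis(cells, key_cell="Before")
print(axis)
print(balanced["Before"])
print(balanced["After"])
```

In this example, "Corrosion" and "Silicon Defect" do not occur in the key cell, so they receive zero counts there and are placed at the end of the axis in label order.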

Unbalanced categories present a special problem when
the MAXNCAT= option is used to restrict the number of
categories displayed on the chart. For instance,
suppose that you specify MAXNCAT=12 and there are
15 categories in all, 10 of which occur in the key cell.
Since there is no unambiguous method for selecting two
of the remaining five categories to complete the
restricted list, the procedure reduces
the restricted list to the categories that occur in the
key cell and displays only those 10 categories. A
warning message is issued in the SAS log.
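
The fallback described above can be expressed as a small decision rule. The Python sketch below is hypothetical: the function name `restricted_list` and the category labels are invented, and the handling of the two cases that the text does not describe (a limit no larger than the key cell count, or no smaller than the total) is an assumption.

```python
def restricted_list(axis, n_key, maxncat):
    """axis: uniform category axis with the n_key key-cell categories first.

    Returns (displayed categories, warning flag).
    """
    if maxncat >= len(axis):
        return list(axis), False       # assumption: nothing to restrict
    if maxncat <= n_key:
        return axis[:maxncat], False   # assumption: top key-cell categories
    # The limit falls among categories with zero frequency in the key
    # cell, so there is no unambiguous way to choose between them:
    # fall back to the key-cell categories only and flag a warning.
    return axis[:n_key], True

# The example from the text: 15 categories, 10 in the key cell, MAXNCAT=12.
axis = [f"C{i:02d}" for i in range(1, 16)]
shown, warned = restricted_list(axis, n_key=10, maxncat=12)
print(len(shown), warned)
```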

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.