The FASTCLUS Procedure

Overview

The FASTCLUS procedure performs a disjoint cluster analysis on the basis of distances computed from one or more quantitative variables. The observations are divided into clusters such that every observation belongs to one and only one cluster (the clusters do not form a tree structure as they do in the CLUSTER procedure). If you want separate analyses for different numbers of clusters, you must run PROC FASTCLUS once for each analysis.
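
Because each run produces a single partition, comparing several cluster counts means invoking the procedure repeatedly. The following is a minimal sketch of one way to do that with a macro loop. The data set WORK.POINTS, the variables X and Y, the macro name RUN_FASTCLUS, and the range of 2 to 4 clusters are hypothetical choices made for illustration only.

```sas
/* Hypothetical test data: 300 points scattered around three centers. */
data work.points;
   do i = 1 to 300;
      group = mod(i, 3);
      x = 5*group + rannor(1999);
      y = 5*group + rannor(1999);
      output;
   end;
   drop i group;
run;

/* Hypothetical macro: one PROC FASTCLUS run per requested cluster count. */
%macro run_fastclus(maxk=4);
   %do k = 2 %to &maxk;
      proc fastclus data=work.points maxclusters=&k out=work.clus&k;
         var x y;
      run;
   %end;
%mend run_fastclus;

%run_fastclus(maxk=4)
```

Each pass writes its own output data set (WORK.CLUS2, WORK.CLUS3, and so on), so the partitions can be compared afterward.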

The FASTCLUS procedure can use an Lp (least pth powers) clustering criterion (Spath 1985, pp. 62-63) instead of the least-squares (L2) criterion used in k-means clustering methods. The LEAST=p option specifies the power p to be used. Using the LEAST= option increases execution time because more iterations are usually required. Values of p less than 2 reduce the effect of outliers on the cluster centers compared with least-squares methods; values of p greater than 2 increase the effect of outliers.
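
As an illustration of the LEAST= option, the sketch below clusters the same data twice: once with the default least-squares criterion and once with LEAST=1, which minimizes absolute rather than squared deviations. The data set WORK.NOISY, the variables X and Y, the injected outliers, and the MAXITER=50 setting are assumptions chosen for the example, not recommended values.

```sas
/* Hypothetical data: three groups plus a few extreme points. */
data work.noisy;
   do i = 1 to 300;
      group = mod(i, 3);
      x = 5*group + rannor(1999);
      y = 5*group + rannor(1999);
      /* Push every 60th point far away to act as an outlier. */
      if mod(i, 60) = 0 then x = x + 40;
      output;
   end;
   drop i group;
run;

/* Default least-squares (L2) criterion. */
proc fastclus data=work.noisy maxclusters=3 out=work.out_l2;
   var x y;
run;

/* L1 criterion: p less than 2, so outliers pull the centers less.  */
/* MAXITER= is set explicitly because more iterations are expected. */
proc fastclus data=work.noisy maxclusters=3 least=1 maxiter=50
              out=work.out_l1;
   var x y;
run;
```

Comparing the cluster centers reported by the two runs gives a rough sense of how strongly the extreme points influence each criterion.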

The FASTCLUS procedure is intended for use with large data sets, with 100 or more observations. With small data sets, the results may be highly sensitive to the order of the observations in the data set. PROC FASTCLUS produces brief summaries of the clusters it finds. For more extensive examination of the clusters, you can request an output data set containing a cluster membership variable.
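
For the more extensive examination mentioned above, the OUT= option creates a data set in which the variable CLUSTER holds each observation's cluster number and DISTANCE holds its distance from the cluster seed. A minimal sketch follows; the data set names WORK.POINTS and WORK.CLUSOUT and the variables X and Y are hypothetical.

```sas
/* Hypothetical test data: 300 points scattered around three centers. */
data work.points;
   do i = 1 to 300;
      group = mod(i, 3);
      x = 5*group + rannor(1999);
      y = 5*group + rannor(1999);
      output;
   end;
   drop i group;
run;

/* Save cluster membership in the OUT= data set. */
proc fastclus data=work.points maxclusters=3 out=work.clusout;
   var x y;
run;

/* Cluster sizes. */
proc freq data=work.clusout;
   tables cluster;
run;

/* Per-cluster summaries of the inputs and the distance to the seed. */
proc means data=work.clusout mean std min max;
   class cluster;
   var x y distance;
run;
```

The output data set can be passed to any other procedure in the same way, for example to plot the clusters or to cross-tabulate them against variables that were not used in the analysis.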




Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.