The TREE Procedure

Example 66.1: Mammals' Teeth

The following data give the numbers of different kinds of teeth for a variety of mammals. The mammals are clustered by average linkage using the CLUSTER procedure (Output 66.1.1). The PROC TREE statement uses the average-linkage distance as the height axis, which is the default, and creates a horizontal high-resolution graphics tree (Output 66.1.2).

   data teeth;
      title 'Mammals'' Teeth';
      input mammal $ 1-16 @21 (v1-v8) (1.);
      label V1='Right Top Incisors'
            V2='Right Bottom Incisors'
            V3='Right Top Canines'
            V4='Right Bottom Canines'
            V5='Right Top Premolars'
            V6='Right Bottom Premolars'
            V7='Right Top Molars'
            V8='Right Bottom Molars';
      datalines;
   Brown Bat           23113333
   Mole                32103333
   Silver Hair Bat     23112333
   Pigmy Bat           23112233
   House Bat           23111233
   Red Bat             13112233
   Pika                21002233
   Rabbit              21003233
   Beaver              11002133
   Groundhog           11002133
   Gray Squirrel       11001133
   House Mouse         11000033
   Porcupine           11001133
   Wolf                33114423
   Bear                33114423
   Raccoon             33114432
   Marten              33114412
   Weasel              33113312
   Wolverine           33114412
   Badger              33113312
   River Otter         33114312
   Sea Otter           32113312
   Jaguar              33113211
   Cougar              33113211
   Fur Seal            32114411
   Sea Lion            32114411
   Grey Seal           32113322
   Elephant Seal       21114411
   Reindeer            04103333
   Elk                 04103333
   Deer                04003333
   Moose               04003333
   ;
   options pagesize=60 linesize=110;

   proc cluster method=average std pseudo noeigen outtree=tree;
      id mammal;
      var v1-v8;
   run;

   proc tree graphics horizontal;
   run;

Output 66.1.1 displays the information on how the clusters are joined. For example, the cluster history shows that the observations Wolf and Bear form cluster 29, which is merged with Raccoon to form cluster 11.

Output 66.1.1: Output from PROC CLUSTER

Mammals' Teeth

The CLUSTER Procedure
Average Linkage Cluster Analysis

The data have been standardized to mean 0 and variance 1

Root-Mean-Square Total-Sample Standard Deviation = 1

Root-Mean-Square Distance Between Observations = 4

Cluster History
NCL  Clusters Joined                 FREQ    PSF   PST2  Norm RMS Dist  Tie
 31  Beaver         Groundhog           2      .      .              0  T
 30  Gray Squirrel  Porcupine           2      .      .              0  T
 29  Wolf           Bear                2      .      .              0  T
 28  Marten         Wolverine           2      .      .              0  T
 27  Weasel         Badger              2      .      .              0  T
 26  Jaguar         Cougar              2      .      .              0  T
 25  Fur Seal       Sea Lion            2      .      .              0  T
 24  Reindeer       Elk                 2      .      .              0  T
 23  Deer           Moose               2      .      .              0
 22  Pigmy Bat      Red Bat             2    281      .         0.2289
 21  CL28           River Otter         3    139      .         0.2292
 20  CL31           CL30                4   83.2      .         0.2357  T
 19  Brown Bat      Silver Hair Bat     2   76.7      .         0.2357  T
 18  Pika           Rabbit              2   73.2      .         0.2357
 17  CL27           Sea Otter           3   67.4      .         0.2462
 16  CL22           House Bat           3   62.9    1.7         0.2859
 15  CL21           CL17                6   47.4    6.8         0.3328
 14  CL25           Elephant Seal       3   45.0      .         0.3362
 13  CL19           CL16                5   40.8    3.5         0.3672
 12  CL15           Grey Seal           7   38.9    2.8         0.4078
 11  CL29           Raccoon             3   38.0      .          0.423
 10  CL18           CL20                6   34.5   10.3         0.4339
  9  CL12           CL26                9   30.0    7.3         0.5071
  8  CL24           CL23                4   28.7      .         0.5473
  7  CL9            CL14               12   25.7    7.0         0.5668
  6  CL10           House Mouse         7   28.3    4.1         0.5792
  5  CL11           CL7                15   26.8    6.9         0.6621
  4  CL13           Mole                6   31.9    7.2         0.7156
  3  CL4            CL8                10   31.0   12.7         0.8799
  2  CL3            CL6                17   27.8   16.1         1.0316
  1  CL2            CL5                32      .   27.8         1.1938
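
The cluster history in Output 66.1.1 is also stored, node by node, in the OUTTREE= data set that PROC TREE reads. As a minimal sketch, the following step lists that data set. It assumes that the variables _NAME_, _PARENT_, _NCL_, _FREQ_, and _HEIGHT_ are present, as the PROC CLUSTER documentation describes for OUTTREE= data sets; the choice of variables in the VAR statement is illustrative and not part of the original example.

```sas
/* Sketch: list the nodes of the OUTTREE= data set created by PROC CLUSTER. */
/* Assumes the standard OUTTREE= variables are present; adjust the VAR      */
/* list if your data set differs.                                           */
proc print data=tree;
   var _name_ _parent_ _ncl_ _freq_ _height_;
run;
```

Each row describes one node. _PARENT_ names the cluster into which the node is merged, and _HEIGHT_ holds the value that PROC TREE uses by default for the height axis.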

Output 66.1.2: PROC TREE High-Resolution Graphics
[Graphic not shown: horizontal tree diagram (treee1b.gif)]

As you look from left to right in the diagram in Output 66.1.2, objects and clusters are progressively joined until a single, all-encompassing cluster is formed at the right (or root) of the diagram. Clusters exist at each level of the diagram, and every vertical line connects leaves and branches into progressively larger clusters. For example, the five bats form a cluster at the 0.6 level, while the next cluster consists only of the mole. The observations Reindeer, Elk, Deer, and Moose form the next cluster at the 0.6 level, and the mammals Pika through House Mouse are in the fourth cluster. The observations Wolf, Bear, and Raccoon form the fifth cluster, while the last cluster contains the observations Marten through Elephant Seal.

The following statements create the same tree with line printer graphics in a vertical orientation; the tree is displayed in Output 66.1.3.

   proc tree lineprinter;
   run;

Output 66.1.3: PROC TREE with the LINEPRINTER Option

                                                                                
                                                                                
                                                                                
                        Average Linkage Cluster Analysis                        
                                                                                
                            Name of Observation or Cluster                      
                                                                                
              S                                                                 
              i                                                                 
              l                         G                                 E     
              v                         r                                 l     
              e                         a   H           R                 e     
              r                         y   o           i                 p     
            B   P   H                 G   P u         W v     S G         h     
            r H i   o   R             r S o s         o e     e r     F S a     
            o a g R u   e             o q r e     R   l r     a e     u e n     
            w i m e s   i         R B u u c       a M v   W B   y J C r a t     
            n r y d e   n     M   a e n i u M     c a e O e a O   a o           
                      M d   D o P b a d r p o W B c r r t a d t S g u S L S     
            B B B B B o e E e o i b v h r i u o e o t i t s g t e u g e i e     
            a a a a a l e l e s k i e o e n s l a o e n e e e e a a a a o a     
            t t t t t e r k r e a t r g l e e f r n n e r l r r l r r l n l     
    A  1.5 +                                                                    
    v      |                                                                    
    e      |                                                                    
    r      |                                                                    
    a      |XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX     
    g      |XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXX     
    e    1 +XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXX     
           |XXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXX     
    D      |XXXXXXXXXXX XXXXXXX XXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXX     
    i      |XXXXXXXXXXX XXXXXXX XXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXX     
    s      |XXXXXXXXX . XXXXXXX XXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXX     
    t      |XXXXXXXXX . XXXXXXX XXXXXXXXXXXXX XXXXX XXXXXXXXXXXXXXXXXXXXXXX     
    a  0.5 +XXXXXXXXX . XXX XXX XXXXXXXXXXX . XXXXX XXXXXXXXXXXXXXXXX XXXXX     
    n      |XXXXXXXXX . XXX XXX XXXXXXXXXXX . XXXXX XXXXXXXXXXXXX XXX XXXXX     
    c      |XXXXXXXXX . XXX XXX XXX XXXXXXX . XXX . XXXXXXXXXXX . XXX XXXXX     
    e      |XXX XXXXX . XXX XXX XXX XXXXXXX . XXX . XXXXX XXXXX . XXX XXX .     
           |. . . . . . XXX XXX . . XXX XXX . XXX . XXX . XXX . . XXX XXX .     
    B      |. . . . . . XXX XXX . . XXX XXX . XXX . XXX . XXX . . XXX XXX .     
    e    0 +. . . . . . XXX XXX . . XXX XXX . XXX . XXX . XXX . . XXX XXX .     
    t                                                                           
    w                                                                           


As you look up from the bottom of the diagram, objects and clusters are progressively joined until a single, all-encompassing cluster is formed at the top (or root) of the diagram. Clusters exist at each level of the diagram. For example, the unbroken line of Xs at the left-most side of the 0.6 level indicates that the five bats have formed a cluster. The next cluster is represented by a period because it contains only one mammal, Mole. Reindeer, Elk, Deer, and Moose form the next cluster, indicated by Xs again. The mammals Pika through House Mouse are in the fourth cluster. The observations Wolf, Bear, and Raccoon form the fifth cluster, while the last cluster contains the observations Marten through Elephant Seal.
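
The six clusters read off the 0.6 level can also be extracted by cutting the tree at that height. The following is a sketch that relies on the LEVEL= option of the PROC TREE statement, which defines the clusters for the OUT= data set by height rather than by count. The data set name level06 is a hypothetical name chosen for illustration.

```sas
/* Sketch: define clusters by cutting the tree at height 0.6. */
/* level06 is a hypothetical output data set name.            */
proc tree data=tree noprint out=level06 level=0.6;
   id mammal;
run;

/* Count the members of each cluster found at that height. */
proc freq data=level06;
   tables cluster;
run;
```

According to the cluster history in Output 66.1.1, the six-cluster solution is formed at a distance of 0.5792 and the five-cluster solution at 0.6621, so a cut at 0.6 falls between the two.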

The following statements sort the clusters at each branch in order of formation and use the number of clusters as the height axis. The resulting tree is displayed in Output 66.1.4.

   proc tree sort height=n horizontal;
   run;

Output 66.1.4: PROC TREE with SORT and HEIGHT= Options
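[Graphic not shown: horizontal tree diagram with sorted branches and the number of clusters as the height axis (treee1d.gif)]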

Because the CLUSTER procedure always produces binary trees, the number of internal (root and branch) nodes in the tree is one less than the number of leaves. Therefore, 31 clusters are formed from the 32 mammals in the input data set. These are represented by the 31 vertical line segments in the tree diagram, each at a different value along the horizontal axis.
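
The leaf and cluster counts can be checked against the OUTTREE= data set. This sketch assumes that the data set holds one row per node and that _FREQ_ gives the number of observations in each node, so that leaves have _FREQ_ equal to 1. The column aliases n_nodes, n_leaves, and n_clusters are hypothetical names.

```sas
/* Sketch: count leaves and clusters in the OUTTREE= data set.           */
/* Assumes one row per node and _FREQ_ = 1 for leaves (no FREQ           */
/* statement was used in PROC CLUSTER).                                  */
proc sql;
   select count(*)        as n_nodes,
          sum(_freq_ = 1) as n_leaves,     /* a true comparison counts as 1 */
          sum(_freq_ > 1) as n_clusters
      from tree;
quit;
```

For a binary tree on the 32 mammals, the query should report 32 leaves and 31 clusters.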

As you examine the tree from left to right, the first vertical line segment is where Beaver and Groundhog are clustered and the number of clusters is 31. The next cluster is formed from Gray Squirrel and Porcupine. The third contains Wolf and Bear. Note how the tree graphically displays the clustering order information that was presented in tabular form by the CLUSTER procedure in Output 66.1.1.

The same clusters as in Output 66.1.2 and Output 66.1.3 can be seen at the six-cluster level of the tree diagram in Output 66.1.4, although the SORT and HEIGHT= options make them appear in a different order.

The following statements create these six clusters and display them in Output 66.1.5. The PROC TREE statement produces no output but creates an output data set indicating the cluster to which each observation belongs at the six-cluster level in the tree.

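   /* Assign each mammal to one of six clusters; no tree is displayed */
   proc tree noprint out=part nclusters=6;
      id mammal;
      copy v1-v8;
   run;

   /* BY-group processing in PROC PRINT requires data sorted by cluster */
   proc sort;
      by cluster;
   run;

   proc print label uniform;
      id mammal;
      var v1-v8;
      format v1-v8 1.;
      by cluster;
   run;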

Output 66.1.5: PROC TREE OUT= Data Set

CLUSTER=1

                 Right     Right     Right    Right    Right      Right      Right   Right
                 Top       Bottom    Top      Bottom   Top        Bottom     Top     Bottom
mammal           Incisors  Incisors  Canines  Canines  Premolars  Premolars  Molars  Molars
Beaver           1         1         0        0        2          1          3       3
Groundhog        1         1         0        0        2          1          3       3
Gray Squirrel    1         1         0        0        1          1          3       3
Porcupine        1         1         0        0        1          1          3       3
Pika             2         1         0        0        2          2          3       3
Rabbit           2         1         0        0        3          2          3       3
House Mouse      1         1         0        0        0          0          3       3

CLUSTER=2

                 Right     Right     Right    Right    Right      Right      Right   Right
                 Top       Bottom    Top      Bottom   Top        Bottom     Top     Bottom
mammal           Incisors  Incisors  Canines  Canines  Premolars  Premolars  Molars  Molars
Wolf             3         3         1        1        4          4          2       3
Bear             3         3         1        1        4          4          2       3
Raccoon          3         3         1        1        4          4          3       2

CLUSTER=3

                 Right     Right     Right    Right    Right      Right      Right   Right
                 Top       Bottom    Top      Bottom   Top        Bottom     Top     Bottom
mammal           Incisors  Incisors  Canines  Canines  Premolars  Premolars  Molars  Molars
Marten           3         3         1        1        4          4          1       2
Wolverine        3         3         1        1        4          4          1       2
Weasel           3         3         1        1        3          3          1       2
Badger           3         3         1        1        3          3          1       2
Jaguar           3         3         1        1        3          2          1       1
Cougar           3         3         1        1        3          2          1       1
Fur Seal         3         2         1        1        4          4          1       1
Sea Lion         3         2         1        1        4          4          1       1
River Otter      3         3         1        1        4          3          1       2
Sea Otter        3         2         1        1        3          3          1       2
Elephant Seal    2         1         1        1        4          4          1       1
Grey Seal        3         2         1        1        3          3          2       2

CLUSTER=4

                 Right     Right     Right    Right    Right      Right      Right   Right
                 Top       Bottom    Top      Bottom   Top        Bottom     Top     Bottom
mammal           Incisors  Incisors  Canines  Canines  Premolars  Premolars  Molars  Molars
Reindeer         0         4         1        0        3          3          3       3
Elk              0         4         1        0        3          3          3       3
Deer             0         4         0        0        3          3          3       3
Moose            0         4         0        0        3          3          3       3

CLUSTER=5

                 Right     Right     Right    Right    Right      Right      Right   Right
                 Top       Bottom    Top      Bottom   Top        Bottom     Top     Bottom
mammal           Incisors  Incisors  Canines  Canines  Premolars  Premolars  Molars  Molars
Pigmy Bat        2         3         1        1        2          2          3       3
Red Bat          1         3         1        1        2          2          3       3
Brown Bat        2         3         1        1        3          3          3       3
Silver Hair Bat  2         3         1        1        2          3          3       3
House Bat        2         3         1        1        1          2          3       3

CLUSTER=6

                 Right     Right     Right    Right    Right      Right      Right   Right
                 Top       Bottom    Top      Bottom   Top        Bottom     Top     Bottom
mammal           Incisors  Incisors  Canines  Canines  Premolars  Premolars  Molars  Molars
Mole             3         2         1        0        3          3          3       3
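
Once each mammal has a cluster number, the clusters can be summarized directly from the OUT= data set. As a sketch, the following steps tabulate the cluster sizes and the mean tooth counts per cluster. They assume that the data set part and its CLUSTER variable were created by the PROC TREE step shown earlier; the MAXDEC= setting is an illustrative choice.

```sas
/* Sketch: size of each cluster in the OUT= data set from PROC TREE. */
proc freq data=part;
   tables cluster;
run;

/* Mean number of each kind of tooth within each cluster. */
proc means data=part mean maxdec=2;
   class cluster;
   var v1-v8;
run;
```

The frequencies should match the group sizes in Output 66.1.5: 7, 3, 12, 4, 5, and 1 mammals in clusters 1 through 6.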


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.