|Windows specifics:||Sort utilities available; SORTSIZE= and TAGSORT statement options|
|Creating Your Own Collating Sequences|
|PROC SORT <option(s)> <collating-sequence-option>;|
Note: This is a simplified version of the SORT procedure syntax. For the complete syntax and its explanation, see the SORT procedure in SAS Procedures Guide.
The SORT procedure sorts observations in a SAS data set by one or more character or numeric variables, either replacing the original data set or creating a new, sorted data set. By default under Windows, the SORT procedure uses the ASCII collating sequence.
The SORT procedure uses the sort utility specified by
the SORTPGM system option. Under Windows, although all three SORTPGM keywords
(HOST, BEST, and SAS) are accepted for compatibility, the SAS sort is always
used. You can use all the options available to the SAS sort utility, such
as the SORTSEQ and NODUPKEY options. For a complete list of all options available,
see the SORT procedure in SAS Procedures Guide.
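As a minimal sketch of the options mentioned above, the following step sorts a small data set by one character variable and drops observations with duplicate BY values. The data set names (WORK.VISITS, WORK.VISITS_SORTED) and the variable names are hypothetical, chosen only for illustration.

```sas
/* Hypothetical sample data; all names are illustrative only. */
data work.visits;
   input patient $ visit;
   datalines;
smith 2
Jones 1
smith 1
adams 3
;
run;

/* OUT= creates a new sorted data set instead of replacing WORK.VISITS. */
/* NODUPKEY removes observations that duplicate an earlier BY value.    */
proc sort data=work.visits out=work.visits_sorted nodupkey;
   by patient;
run;
```

Because the ASCII collating sequence places uppercase letters before lowercase letters, Jones sorts ahead of adams in the output.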
Under Windows, you can use the SORTSIZE= option in the PROC SORT statement to limit the amount of memory available to the SORT procedure. Limiting memory in this way may reduce the amount of swapping the SAS System must do to sort the data set, because the SORT procedure's algorithm can swap data more efficiently than Windows can. If PROC SORT needs more memory than you specify, it creates a temporary utility file in your SASWORK directory to hold the data.
The syntax of the SORTSIZE= option is as follows:
SORTSIZE=memory-specification
where memory-specification can be one of the following:
|n||specifies the amount of memory in bytes.|
|nK||specifies the amount of memory in 1-kilobyte multiples.|
|nM||specifies the amount of memory in 1-megabyte multiples.|
The default SAS configuration file sets this option to 2 MB using the SORTSIZE= system option. A value of 2 MB works well for most memory configurations. If your machine has more than 12 MB of physical memory and you are sorting large data sets, setting the SORTSIZE= option to a value greater than 2M might improve performance.
You can override the default value of the SORTSIZE= system option by specifying a different SORTSIZE= value in the PROC SORT statement, or by submitting an OPTIONS statement that sets the SORTSIZE= system option to a new value.
The SORTSIZE= option is also discussed in Improving Performance of the SORT Procedure.
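The two ways of overriding the default can be sketched as follows. The value 4M is an arbitrary illustrative choice, not a recommendation, and the data set and variable names are hypothetical.

```sas
/* Hypothetical sample data; all names are illustrative only. */
data work.sales;
   input region $ amount;
   datalines;
west 120
east 340
north 95
;
run;

/* Statement-level override: applies to this PROC SORT step only. */
proc sort data=work.sales out=work.sales_sorted sortsize=4M;
   by region;
run;

/* System option override: applies to sorts submitted afterward. */
options sortsize=4M;
```

Setting the value in the PROC SORT statement keeps the change local to one step, which may be preferable when only a single large sort needs the extra memory.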
The TAGSORT option
in the PROC SORT statement is useful when there may not be enough disk space
to sort a large SAS data set. When you specify the TAGSORT option,
only sort keys (that is, the variables specified in the BY statement) and
the observation number for each observation are stored in the temporary files.
The sort keys, together with the observation number, are referred to as tags. At the completion of the sorting process, the tags are used
to retrieve the records from the input data set in sorted order. Thus, in
cases where the total number of bytes of the sort keys is small compared with
the length of the record, temporary disk use is reduced considerably. You
should have enough disk space to hold another copy of the data (the output
data set) or two copies of the tags, whichever is greater. Note that while
using the TAGSORT option may reduce
temporary disk use, the processing time may be much higher. However, on PCs
with limited available disk space, the TAGSORT option may allow sorts to be
performed in situations where they would otherwise not be possible.
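A minimal sketch of a TAGSORT request follows. The data set is hypothetical and far too small to need TAGSORT; it only illustrates the case described above, where the sort key (an 8-byte numeric variable) is small compared with the length of the record (which includes a 200-byte character variable).

```sas
/* Hypothetical sample data: short sort key, long record. */
data work.claims;
   length notes $ 200;
   input id notes $;
   datalines;
3 third
1 first
2 second
;
run;

/* TAGSORT stores only the BY variable and the observation number */
/* in the temporary files during the sort.                        */
proc sort data=work.claims out=work.claims_sorted tagsort;
   by id;
run;
```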
If you want to provide your own collating sequences or change a collating sequence that has been provided for you, use the TRANTAB procedure to create or modify translate tables. For complete details on the TRANTAB procedure, see SAS Procedures Guide. When you create your own translate tables, they are stored in your PROFILE catalog and they override any translate tables by the same name that are stored in the HOST catalog.
Note: System managers can modify the HOST catalog by copying newly created tables from the SASUSER.PROFILE catalog to the HOST catalog. Then all users can access the new or modified translate tables.
If you want to see the names of the collating sequences stored in the HOST catalog (using the SAS Explorer), submit the following statement:
dm 'catalog sashelp.host' catalog;
Alternatively, you can select the View pull-down menu, then select the Libraries item, then double-click on the SASHELP library, and then double-click on the HOST catalog. In batch mode, you can use the following statements to generate a list of the contents of the HOST catalog:
proc catalog catalog=sashelp.host;
   contents;
run;
Entries of type TRANTAB are the collating sequences.
If you want to see the contents of a particular translate table, use the following statements:
proc trantab table=table-name;
   list;
run;
The contents of the collating sequence are displayed in the SAS log.
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.