| SAS/ACCESS Software for PC File Formats: Reference |
This chapter introduces
SAS System users to XLS files. It focuses on the terms and concepts that
help you use the SAS/ACCESS interface
and includes descriptions of
XLS files contain data in the form of Microsoft Excel
spreadsheets. Unless otherwise noted, Excel 4 and Excel 5 files are referred
to collectively throughout this report as XLS. Excel 5 files are not supported
under OS/2.
Note:
The ACCESS and DBLOAD procedures do not support
files in the Excel 97 (Version 8) format. However, you can still access Excel
97 files through the SAS Import/Export facility. From a SAS session's PROGRAM
EDITOR window, select the File
menu and then select the Import Data
or Export Data item. Information
about how to import and export Excel data is available from the [Help] button. To write code to import or export Excel data, refer to the
IMPORT or EXPORT procedure description in the SAS Procedures Guide. ![[cautend]](../common/images/cautend.gif)
Various
software products, such as the Microsoft Excel spreadsheet, enable you to
use spreadsheet or database files to enter, organize, and perform calculations
on data. Spreadsheets are most often used for general ledgers, income statements,
and other types of financial record keeping. Database files also enable you
to organize related information, such as the data in an accounts-receivable
journal.
In spreadsheets, the data are organized according to
certain relationships among data items. These relationships are expressed
in a tabular form_in columns and rows. Each column represents
one category of data, and each row can hold one data value for
each column.
A Microsoft Excel 5.0 worksheet, for example, is an electronic
spreadsheet consisting of a grid of 256 columns and 16,384 rows. The intersection
of a column and a row is called a cell. Columns and Rows of Data in an XLS File illustrates a portion
of a standard Excel worksheet.
Columns and Rows of Data in an XLS File
Column
letters for each column appear above the worksheet.
Columns are lettered A through IV (A to Z, AA to AZ, BA to BZ, and so on
to IV). Row numbers for each row appear to the left of the worksheet. Rows
are numbered 1 to 16,384. For Excel 4 files, only one worksheet (worksheet
1) is allowed per file, but more than one worksheet can be stored in a workbook.
You must convert any worksheets you store in a workbook back to worksheets
before you can use the data in a SAS program.
A range is a subset of cells in a worksheet. A range is
identified by
its address, which begins with the name of the top left cell and ends with
the name of the bottom right cell separated by two periods. For example,
the range B3..D6 is the range address for a rectangular block of 12 cells
whose top left cell is B3 and whose bottom right cell is D6 (as shaded in
the display).
The
following conventions apply to XLS filenames. Filenames must also follow
operating-system specific conventions, so check the documentation that comes
with your Microsoft Excel product or other software products for further information.
-
Under Windows 95, Windows 98, Windows NT, and
OS/2, the ACCESS and DBLOAD procedures support long names that are specified
in the PATH= statement (such as
path= 'c:\sasdemo\library\new_customer_1999.xls';). However, XLS files
with long names might not accepted by some versions of Microsoft Excel.
-
Filenames start with a letter, and they can contain
any combination of the letters A through Z, the digits 0 through 9, the underscore
(_), the hyphen (-), and spaces (blanks) within filenames.
-
Filenames can contain spaces. Filenames that
contain spaces or lowercase letters are supported by the ACCESS and DBLOAD
procedures, but they might not be accepted by some versions of Microsoft Excel.
Microsoft
Excel software has two data types: character and numeric. Microsoft Excel
character data may be entered as labels or formula strings; Microsoft Excel
numeric data may be entered as numbers or formulas.
Character data are generally considered text and can include dates
and numbers.
Numeric data
can include numbers (0 through 9), formulas, and cell
entries that begin with one of the following symbols: +, $, @, -, =,
or #. When you create and load an Excel file with PROC DBLOAD, the SAS/ACCESS engine
supplies
#NA for a missing,
numeric value.
Numeric data also can include date
and time values. In Microsoft Excel software, a date value is
the integer portion of a number that can range from 01 January 1900 to 31
December 2078, that is, 1 to 65380. A Microsoft Excel software time
value is the decimal portion of a number that represents time as a
proportion of a day. For example, 0.0 is midnight, 0.5 is noon, and 0.999988
is 23:59:59 (on a 24-hour clock). While a number can have both a date and
a time portion, the formats in Microsoft Excel display a number only in a
date, time, or datetime format. For information on how the SAS/ACCESS
interface handles date and time values and formats, see How the SAS System Handles Date and Time Values.
When you create an access descriptor, the interface software
uses the column types and formats in the XLS file to determine the corresponding
SAS variable formats. The SAS System generates its default formats based
on the values that you specify for the SCANTYPE, SKIPROWS, and GETNAMES statements
(or in the corresponding fields in the Access Descriptor Identification window).
You can change the formats generated by the software interface. For more
information, see How the SAS/ACCESS Interface Works.
When
you create an access descriptor, any data value that does not match the column
type (character or numeric) is treated as a missing value. This is the default
action. However, you can use the MIXED=YES statement to convert numeric data
values in a character column to their character representation.
You can also
set the SS_MIXED environment variable to
YES in your SAS configuration
file so that both numeric and character data are displayed as SAS character
data. Add this line to your SAS configuration file:
-SET SS_MIXED YES
See Setting Environment Variables
for more information on environment variables. For more information on changing
the column type from the type determined by SAS/ACCESS
software when you create an access descriptor, refer to the sections on XLS-specific
procedure statements later in this chapter.
The conversion of date and
time values between SAS data sets
and Microsoft Excel spreadsheets is transparent to users. However, you are
encouraged to understand the differences between them.
Microsoft Excel date and time values and formats are
described in XLS Data Types.
As described earlier in this chapter, an XLS date value is the
integer portion of a number that represents the number of days between January
1, 1900 and a specified date. An XLS time value is a decimal portion of a
number that represents time as a portion of the day. For example, 0.0 is 12:00:00
a.m., and 0.9999884 is 11:59:59 p.m. While a number can have both a date and
a time portion, the formats in XLS display a number only in a date format
or in a time format. For example, for 1:00 p.m., March 12, 1994, the XLS date
value is 34405, the time value is 0.5416667, and the datetime value is 34405.5416667.(footnote 1)
The SAS System handles date and time values differently
than XLS. A SAS date value is an integer that represents the number of days
between January 1, 1960 and a specified date. A SAS time value is an integer
that represents the number of seconds since midnight of the current day.
When a date and a time are both present, the SAS System stores the value as
the number of seconds since midnight, January 1, 1960. For example, for 1:00
p.m., March 12, 1994, the SAS date value is12489, and the SAS time value is
46800. Therefore, the SAS datetime value is 1079096400.
When you create an access descriptor, the SAS System
converts an XLS datetime format to its corresponding SAS datetime format if
an XLS datetime format is specified for the variable in the XLS file. Note
that if the datetime value does not have an XLS format in the XLS file, the
SAS System treats the datetime value like a numeric value.
To convert an XLS datetime format to a SAS datetime
format, you need a SAS datetime format in the access descriptor. For example,
changing the default SAS numeric format (15.2) to a SAS date format in the
descriptor causes the XLS date value (based on January 1, 1900) to be converted
to an equivalent SAS date value (based on January 1, 1960). In other words,
the XLS numeric value for January 1, 1960 (which is 21916) is converted to
the equivalent SAS representation of January 1, 1960 (which is 0) only if
a SAS datetime format is assigned in the descriptor for that column. Otherwise,
the XLS value of 21916 is treated as a SAS numeric value of 21916.
The following table shows how the SAS System uses a
Microsoft Excel datetime value to convert to a SAS datetime format.
If a SAS variable is specified with a date, time, or datetime
format in the FORMAT statement, the interface view engine converts that SAS
datetime format into the equivalent Microsoft Excel datetime format when the
new XLS file is created.
However, if a SAS datetime format is not specified in the input
SAS data set, you have to assign a format by using a PROC DBLOAD FORMAT statement.
Doing so assigns a Microsoft Excel datetime format to the SAS variable when
the variable is loaded into an XLS file. If you do not assign a SAS datetime
format, the SAS numeric value for the date is written to the XLS file. Because
SAS dates are based on January 1, 1960, and Microsoft Excel dates are based
on January 1, 1900, the date value in the XLS file will be inaccurate.
To maintain a SAS variable format in the input data
set, yet change it just while the DBLOAD procedure is in progress, use the
FORMAT statement in PROC DBLOAD. This statement enables you to assign a temporary
format to a SAS variable for the duration of the procedure without affecting
the input SAS data set.
For example, if the SAS format for the BIRTHDAT variable
in the MYDATA.SASEMPS access descriptor is left at the default 15.2 format,
you can specify the FORMAT statement to change the variable's format to DATE7.
while you are creating and loading the XLS file. When you load the XLS file,
the DATE7. format becomes an equivalent Microsoft column format, DDMMMYY.
When the DBLOAD procedure has completed, the SAS format for the BIRTHDAT
variable returns to the 15.2 format.
You can specify the FORMAT statement in the PROC DBLOAD
statement when you invoke the procedure using any of the methods of processing.
Use PROC ACCESS
to define descriptors that identify spreadsheet data and the conversions necessary
to use the data in SAS programs. The Microsoft Excel label data type is formatted
as a SAS character type, and the Microsoft Excel number data type is formatted
as a SAS numeric type.
Fonts, attributes, and colors in the XLS files are not
read into the SAS data sets. However, the ACCESS procedure supports most
of the XLS number formats and automatically converts them to the corresponding
SAS formats. Any XLS data strings longer than 200 characters are truncated
while being converted into SAS data sets, and any SAS data file created from
XLS files can contain up to 256 variables and 16,384 observations.
Default SAS System Variable Formats for XLS File Data
shows the default SAS System variable formats that the ACCESS procedure assigns
to each type of standard XLS file data. Default SAS System Variable Formats for Customized XLS Format Strings provides SAS System variable formats for
customized XLS format strings. XLS file numeric data include date and time
values. See How the SAS System Handles Date and Time Values
for more information.
Default SAS System Variable Formats for XLS File Data
|
XLS File Data |
SAS Variable Format |
|
Data Type |
XLS Format String |
Type |
Format |
|
Char (table note 1)
|
@ (table note 2)
|
Char |
$w. |
|
Numeric (table note 3)
|
General |
Num |
BEST |
|
Numeric |
0 |
Num |
w.d
|
|
Numeric |
0.00 |
Num |
w.d
|
|
Numeric |
#,##0 |
Num |
COMMAw.d |
|
Numeric |
#,##0.00 |
Num |
COMMAw.d |
|
Numeric |
#,##0_);(#,##0) |
Num |
NEGPARENw.d |
|
Numeric |
#,##0_);[Red](#,##0) |
Num |
NEGPARENw.d |
|
Numeric |
#,##0.00_);(#,##0.00) |
Num |
NEGPARENw.d |
|
Numeric |
#,##0.00_);[Red](#,##0.00) |
Num |
NEGPARENw.d |
|
Numeric |
$#,##0_);($#,##0) |
Num |
DOLLARw.d |
|
Numeric |
$#,##0_);[Red]($#,##0) |
Num |
DOLLARw.d |
|
Numeric |
($#,##0.00_);($#,##0.00) |
Num |
DOLLARw.d |
|
Numeric |
($#,##0.00_);[Red]($#,##0.00) |
Num |
DOLLARw.d |
|
Numeric |
_($*#,##0_);_($*(#,##0);_($*"-"_);_(@_) |
Num |
DOLLARw.d |
|
Numeric |
_(*#,##0_);_(*(#,##0);_(*"-"_);_(@_) |
Num |
NEGPARENw.d |
|
Numeric |
_($*#,##0.00_);_($*(#,##0.00);_($*"-"??_);_(@_) |
Num |
DOLLARw.d |
|
Numeric |
_(*#,##0.00_);_(*(#,##0.00);_(*"-"??_);_(@_) |
Num |
NEGPARENw.d |
|
Numeric |
0% |
Num |
PERCENTw.d |
|
Numeric |
0.00% |
Num |
PERCENTw.d |
|
Numeric |
0.00E+00 |
Num |
Ew.d |
|
Numeric |
##0.0E+0 |
Num |
Ew.d |
|
Numeric |
m/d/yy
|
Num |
MMDDYYw. |
|
Numeric |
d-mmm-yy
|
Num |
MMDDYYw. |
|
Numeric |
d-mmm
|
Num |
DATEw. |
|
Numeric |
mmm-yy
|
Num |
MONYYw. |
|
Numeric |
h:mm AM/PM |
Num |
TIMEw. |
|
Numeric |
h:mm:ss AM/PM |
Num |
TIMEw. |
|
Numeric |
h:mm
|
Num |
TIMEw. |
|
Numeric |
hh:mm |
Num |
TIMEw. |
|
Numeric |
h:mm:ss
|
Num |
TIMEw. |
|
Numeric |
hh:mm:ss
|
Num |
TIMEw. |
|
Numeric |
m/d/yy h:mm
|
Num |
DATETIMEw. |
|
Numeric |
ddmmmyy
|
Num |
DATEw. |
|
Numeric |
ddmmmyyyy:hh:mm:ss
|
Num |
DATETIMEw. |
|
Numeric |
dd
|
Num |
DATEw. |
|
Numeric |
dd/mm/yy
|
Num |
DDMMYYw. |
|
Numeric |
dddd
|
Num |
DATEw. |
|
Numeric |
mm/dd/yy
|
Num |
MMDDYYw. |
|
Numeric |
mm:ss
|
Num |
MMSSw. |
|
Numeric |
mm yy
|
Num |
MONYYw. |
|
Numeric |
mm yyyy
|
Num |
MONYYw. |
|
Numeric |
mm:yy
|
Num |
MONYYw. |
|
Numeric |
mm:yyyy
|
Num |
MONYYw. |
|
Numeric |
mm-yy
|
Num |
MONYYw. |
|
Numeric |
mm-yyyy
|
Num |
MONYYw. |
|
Numeric |
mmyy
|
Num |
MONYYw. |
|
Numeric |
mmyyyy
|
Num |
MONYYw. |
|
Numeric |
mm.yy
|
Num |
MONYYw. |
|
Numeric |
mm.yyyy
|
Num |
MONYYw. |
|
Numeric |
mm/yy
|
Num |
MONYYw. |
|
Numeric |
mm/yyyy
|
Num |
MONYYw. |
|
Numeric |
mmmm
|
Num |
MONYYw. |
|
Numeric |
m
|
Num |
MONYYw. |
|
Numeric |
mmmyy
|
Num |
MONYYw. |
|
Numeric |
mmmyyyy
|
Num |
MONYYw. |
|
Numeric |
dddd, mmmm dd, yyyy
|
Num |
MONYYw. |
|
Numeric |
dddd, dd mmmm yyyy
|
Num |
MONYYw. |
|
Numeric |
mmmm dd, yyyy
|
Num |
MONYYw. |
|
Numeric |
dd mmmm yyyy
|
Num |
MONYYw. |
|
Numeric |
yy
|
Num |
YYMMDDw. |
|
Numeric |
yyyy
|
Num |
YYMMDDw. |
|
Numeric |
yy mm
|
Num |
YYMMDDw. |
|
Numeric |
yyyy mm
|
Num |
YYMMDDw. |
|
Numeric |
yy:mm
|
Num |
YYMMDDw. |
|
Numeric |
yyyy:mm
|
Num |
YYMMDDw. |
|
Numeric |
yy-mm
|
Num |
YYMMDDw. |
|
Numeric |
yyyy-mm
|
Num |
YYMMDDw. |
|
Numeric |
yymm
|
Num |
YYMMDDw. |
|
Numeric |
yyyymm
|
Num |
YYMMDDw. |
|
Numeric |
yy.mm
|
Num |
YYMMDDw. |
|
Numeric |
yyyy.mm
|
Num |
YYMMDDw. |
|
Numeric |
yy/mm
|
Num |
YYMMDDw. |
|
Numeric |
yyyy/mm
|
Num |
YYMMDDw. |
|
Numeric |
yy-mm-dd
|
Num |
YYMMDDw. |
|
Numeric |
yymmm
|
Num |
YYMMDDw. |
|
Numeric |
yyyymmm
|
Num |
YYMMDDw. |
TABLE NOTE 1:
Label
data.
TABLE NOTE 2:
The XLS
character format for Excel Version 5.0.
TABLE NOTE 3:
Number, formula, or missing data.
Note
that w is based on
Excel column width; .d is controlled by the Excel
format string.
If XLS file data fall outside of
the valid SAS data ranges, you receive an error message in the SAS log when
you try to access the data.
The
SAS/ACCESS
interface does not fully support the Microsoft Excel hidden and text formats.
XLS data in hidden format are displayed in SAS data sets; however, you can
drop the hidden column when you are creating the access descriptor. If you
want to display the formula in the text format, add a space to indicate that
the formula entry is a label. Otherwise, the results of the formula are displayed.
If you have
set the SS_MIXED environment variable to YES, the
numerical values in XLS files are converted to character strings in SAS data
sets if the corresponding SAS variable type is specified as character.
This
section explains how SAS data are read into Microsoft Excel data when a table
is loaded. In this conversion, the SAS character data type is converted into
the Microsoft Excel label type and the SAS numeric type is converted into
the Microsoft Excel number type.
The SAS/ACCESS interface
automatically converts SAS formats to the same or associated Microsoft Excel
formats and column widths. However, you can temporarily assign other formats
and column widths to SAS variables by using the FORMAT statement so that the
loaded XLS file columns have the formats you want. Converting SAS System Variable Formats to XLS File Data shows the SAS System variable types and
formats and the XLS data types, formats, and column widths that you can assign
them to.
Note:
The FORMAT statement in PROC DBLOAD only
changes the format of SAS variables while you are creating and loading the
XLS files. When the procedure is completed, the formats of SAS variables
return to their original settings. ![[cautend]](../common/images/cautend.gif)
XLS values are numeric data. See How the SAS System Handles Date and Time Values for more
information.
Converting SAS System Variable Formats to XLS File Data
|
SAS Variable Format |
XLS File Data |
|
Type |
Format |
XLS Format String |
Data Type |
|
Char |
" " |
General |
LABEL |
|
Char |
$CHAR |
General |
LABEL |
|
Char |
$ |
General |
LABEL |
|
Num |
BESTw.d |
General |
NUMBER |
|
Num |
COMMAw.d |
#,##0 |
NUMBER |
|
Num |
COMMAXw.d |
#,##0 |
NUMBER |
|
Num |
DATEw. |
ddmmmyy
|
NUMBER |
|
Num |
DATETIMEw.d |
ddmmmyyyy:hh:mm:ss |
NUMBER |
|
Num |
DAYw. |
dd |
NUMBER |
|
Num |
DDMMYYw. |
dd/mm/yy |
NUMBER |
|
Num |
DOLLARw.d |
"$"#,##0_);("$"#,##0) |
NUMBER |
|
Num |
DOLLARXw.d |
"$"#,##0_);("$"#,##0) |
NUMBER |
|
Num |
DOWNAMEw.d |
dddd
|
NUMBER |
|
Num |
Ew. |
0.00E+00 |
NUMBER |
|
Num |
HHMMw.d |
h:mm |
NUMBER |
|
Num |
HOURw.d |
h:mm |
NUMBER |
|
Num |
JULDAYw. |
m/d/yy |
NUMBER |
|
Num |
JULIANw. |
m/d/yy |
NUMBER |
|
Num |
MMDDYYw. |
mm/dd/yy |
NUMBER |
|
Num |
MMSSw.d |
mm:ss |
NUMBER |
|
Num |
MMYYxw. |
mmyy
|
NUMBER |
|
Num |
MMYYC |
mm:yy |
NUMBER |
|
Num |
MMYYD |
mm-yy |
NUMBER |
|
Num |
MMYYN |
mmyy
|
NUMBER |
|
Num |
MMYYP |
mm.yy |
NUMBER |
|
Num |
MMYYS |
mm/yy |
NUMBER |
|
Num |
MONNAMEw. |
mmmm
|
NUMBER |
|
Num |
MONTHw. |
m
|
NUMBER |
|
Num |
MONYYw. |
mmmyy
|
NUMBER |
|
Num |
NEGPARENw.d |
#,##0_);(#,##0) |
NUMBER |
|
Num |
NENGOw. |
m/d/yy |
NUMBER |
|
Num |
PERCENTw.d |
0% |
NUMBER |
|
Num |
QTRw. |
m/d/yy |
NUMBER |
|
Num |
QTRRw. |
m/d/yy |
NUMBER |
|
Num |
SSNw. |
000-00-0000 |
NUMBER |
|
Num |
TIMEw.d |
h:mm:ss |
NUMBER |
|
Num |
TODw. |
h:mm:ss |
NUMBER |
|
Num |
W |
0 |
NUMBER |
|
Num |
WEEKDATEw. |
dddd, mmmmdd,
yyyy
|
NUMBER |
|
Num |
WEEKDATXw. |
dddd, dd
mmmmyyyy
|
NUMBER |
|
Num |
WEEKDAYw. |
m/d/yy |
NUMBER |
|
Num |
WORDDATEw. |
mmmmdd, yyyy |
NUMBER |
|
Num |
WORDDATXw. |
ddmmmmyyyy
|
NUMBER |
|
Num |
YEARw. |
yy or yyyy |
NUMBER |
|
Num |
YYMM |
yy mm
|
NUMBER |
|
Num |
YYMMC |
yy:mm |
NUMBER |
|
Num |
YYMMD |
yy-mm |
NUMBER |
|
Num |
YYMMN |
yymm
|
NUMBER |
|
Num |
YYMMP |
yy.mm |
NUMBER |
|
Num |
YYMMS |
yy/mm |
NUMBER |
|
Num |
YYMMDDw. |
yy-mm-dd |
NUMBER |
|
Num |
YYMONw. |
yymmm
|
NUMBER |
|
Num |
Zw.d |
0w.d |
NUMBER |
|
Num |
FRACTw. |
# ?/? |
NUMBER |
Note that Excel
column widths are set to w and displayed in the
column. If data are larger than column width, the data are displayed as pound
signs (###), in which case the data can be viewed by adjusting the column
width.
FOOTNOTE 1:
In this description, datetime (in lowercase)
refers to any value or format that represents a date, a time, or both a date
and a time.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.