LMS and LTS Calls

CALL LMS( sc, coef, wgt, opt, y <, x <, sorb>> );

performs robust (resistant) regression, defined by
minimizing the hth ordered squared residual.

CALL LTS( sc, coef, wgt, opt, y <, x <, sorb>> );

performs robust (resistant) regression, defined by
minimizing the sum of the h smallest squared
residuals.
The Least Median of Squares (LMS) and Least
Trimmed Squares (LTS) subroutines perform robust regression
(sometimes called resistant regression). They are able
to detect outliers and perform a least-squares regression on
the remaining observations.
The value of h can be specified, but in most applications
the default value works well, and the results are
quite stable with respect to different choices of h.
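The two criteria can be sketched outside SAS/IML as follows. This is an illustrative Python sketch of the objective functions only; the function names and the direct evaluation are assumptions, not the PROGRESS resampling algorithm that the subroutines actually use:

```python
import numpy as np

def lms_objective(y, X, b, h):
    """LMS criterion: the h-th smallest squared residual."""
    r2 = np.sort((y - X @ b) ** 2)
    return r2[h - 1]            # h is 1-based, as in the text

def lts_objective(y, X, b, h):
    """LTS criterion: the sum of the h smallest squared residuals."""
    r2 = np.sort((y - X @ b) ** 2)
    return r2[:h].sum()
```

Both criteria ignore the N - h largest squared residuals, which is what makes the resulting fits resistant to outliers.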
The inputs to the LMS and LTS subroutines are as follows:
 opt
 refers to an options vector with the following components
(missing values are treated as default values).
The options vector can be a null vector.
 opt[1]
 specifies whether an intercept is used in the model
(opt[1]=0) or not (opt[1] != 0).
If opt[1]=0, then a column of 1s is added as
the last column of the input matrix X; that
is, you do not need to add this column of 1s yourself.
The default is opt[1]=0.
 opt[2]
 specifies the amount of printed output.
Higher values request additional output
and include the output of lower values.
 opt[2]=0
 prints no output except error messages.
 opt[2]=1
 prints all output except (1) arrays of O(N),
such as weights, residuals, and diagnostics; (2)
the history of the optimization process; and (3)
subsets that result in singular linear systems.
 opt[2]=2
 additionally prints arrays of O(N), such
as weights, residuals, and diagnostics;
also prints the case numbers of the
observations in the best subset and some
basic history of the optimization process.
 opt[2]=3
 additionally prints subsets that
result in singular linear systems.
The default is opt[2]=0.
 opt[3]
 specifies whether only LMS or LTS is computed or
whether, additionally, least-squares (LS) and
weighted least-squares (WLS) regression are computed:
 opt[3]=0
 computes only LMS or LTS.
 opt[3]=1
 computes, in addition to LMS or LTS, weighted
least-squares regression on the observations
with small LMS or LTS residuals (where
small is defined by opt[8]).
 opt[3]=2
 computes, in addition to LMS or LTS,
unweighted least-squares regression.
 opt[3]=3
 adds both unweighted and weighted least-squares
regression to LMS and LTS regression.
The default is opt[3]=0.
 opt[4]
 specifies the quantile h to be minimized.
This is used in the objective function.
The default is opt[4] = h = [(N+n+1)/2], which
corresponds to the highest possible breakdown value.
This is also the default of the PROGRESS program.
The value of h should be in the range
N/2 + 1 <= h <= 3N/4 + (n+1)/4.
 opt[5]
 specifies the number N_{Rep} of generated subsets.
Each subset consists of n observations
(k_{1}, ... ,k_{n}), where 1 <= k_{i} <= N.
The total number of subsets consisting of
n observations out of N observations is
N_{tot} = N!/(n!(N-n)!), where n is the number of
parameters including the intercept.
Due to computer time restrictions, not all subset
combinations of n observations out of N can
be inspected for larger values of N and n.
Specifying a value of N_{Rep} < N_{tot}
enables you to save computer time at the
expense of computing a suboptimal solution.
If opt[5] is zero or missing, the default
number of subsets is taken from the following table.
n            1        2     3     4     5     6     7     8     9    10
N_{lower}    500      50    22    17    15    14     0     0     0     0
N_{upper}    1000000  1414  182   71    43    32    27    24    23    22
N_{Rep}      500      1000  1500  2000  2500  3000  3000  3000  3000  3000

n           11    12    13    14    15
N_{lower}    0     0     0     0     0
N_{upper}   22    22    22    23    23
N_{Rep}     3000  3000  3000  3000  3000
 If the number of cases (observations) N is smaller than
N_{lower}, then all possible subsets are used;
otherwise, N_{Rep} subsets are drawn randomly.
This means that an exhaustive search
is performed for opt[5]=-1.
If N is larger than N_{upper}, a note is printed
in the log file indicating how many subsets exist.
 opt[7]
 specifies that the last argument sorb contains a
given parameter vector b rather than a given subset
for which the objective function should be evaluated.
 opt[8]
 is relevant only for LS and WLS
regression (opt[3] > 0).
It specifies whether the covariance matrix of
parameter estimates and approximate standard
errors (ASEs) are computed and printed:
 opt[8]=0
 does not compute covariance matrix and ASEs.
 opt[8]=1
 computes the covariance matrix and ASEs
but prints only the ASEs.
 opt[8]=3
 computes and prints both the covariance
matrix and the ASEs.
The default is opt[8]=0.
 y
 refers to an N x 1 vector y of responses.
 x
 refers to an N x n matrix X of regressors.
If opt[1] is zero or missing, an intercept
is added by default as the last column of X.
If the matrix X is not specified,
y is analyzed as a univariate data set.
 sorb
 refers to an n vector containing either of the following:

n observation numbers of a subset for which
the objective function should be evaluated;
this subset can be the start for a pairwise
exchange algorithm if opt[4] is specified.

n given parameters b = (b_{1}, ... ,b_{n})
(including the intercept, if necessary) for
which the objective function should be evaluated.
Missing values are not permitted in x or y.
Missing values in opt cause the default value to be used.
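As a quick check of the defaults described above, the highest-breakdown quantile h = [(N+n+1)/2] can be computed directly. This is a hedged Python sketch; the helper name is hypothetical:

```python
def default_h(N, n):
    """Default quantile h = floor((N + n + 1) / 2),
    the highest-breakdown choice used when opt[4] is missing."""
    return (N + n + 1) // 2
```

For the stack-loss example below (N=21 observations, n=4 parameters including the intercept) this gives h=13, matching the "13th ordered squared residual" in the printed log.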
The LMS and LTS subroutines return the following values:
 sc
 is a column vector containing the following scalar information,
where rows 1-9 correspond to LMS or LTS regression and
rows 11-14 correspond to either LS or WLS:
 sc[1]
 the quantile h used in the objective function
 sc[2]
 number of subsets generated
 sc[3]
 number of subsets with singular linear systems
 sc[4]
 number of nonzero weights w_{i}
 sc[5]
 lowest value of the objective function
F_{LMS} or F_{LTS} attained
 sc[6]
 preliminary LMS or LTS scale estimate S_{P}
 sc[7]
 final LMS or LTS scale estimate S_{F}
 sc[8]
 robust R^{2} (coefficient of determination)
 sc[9]
 asymptotic consistency factor
If opt[3] > 0, then the following can also be set:
 sc[11]
 LS or WLS objective function (sum of squared residuals)
 sc[12]
 LS or WLS scale estimate
 sc[13]
 R^{2} value for LS or WLS
 sc[14]
 F value for LS or WLS
For opt[3]=1 or opt[3]=3, these rows
correspond to WLS estimates; for opt[3]=2,
these rows correspond to LS estimates.
 coef
 is a matrix with n columns containing
the following results in its rows:
 coef[1,]
 LMS or LTS parameter estimates
 coef[2,]
 indices of observations in the best subset
If opt[3] > 0, then the following can also be set:
 coef[3,]
 LS or WLS parameter estimates
 coef[4,]
 approximate standard errors of LS or WLS estimates
 coef[5,]
 t values
 coef[6,]
 p values
 coef[7,]
 lower boundary of Wald confidence intervals
 coef[8,]
 upper boundary of Wald confidence intervals
For opt[3]=1 or opt[3]=3, these rows correspond
to WLS estimates; for opt[3]=2, to LS estimates.
 wgt
 is a matrix with N columns containing
the following results in its rows:
 wgt[1,]
 weights (=1 for small residuals, =0 for large residuals)
 wgt[2,]
 residuals r_{i} = y_{i} - x_{i}b
 wgt[3,]
 resistant diagnostic u_{i} (note that the resistant
diagnostic cannot be computed for a perfect fit,
when the objective function is zero or nearly zero)
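The weight row wgt[1,] can be sketched as a simple threshold rule. The 2.5 cutoff on scaled residuals follows the usage in the example below; the function name and vectorized form are assumptions for illustration, not the PROGRESS implementation:

```python
import numpy as np

def robust_weights(residuals, scale, cutoff=2.5):
    """Weight 1 where |r_i| <= cutoff * scale, else 0
    (observations with large residuals are dropped from WLS)."""
    return (np.abs(residuals) <= cutoff * scale).astype(int)
```

Observations receiving weight 0 are exactly those flagged as outliers, and the subsequent weighted least-squares fit uses only the weight-1 cases.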
Example
Consider results for Brownlee's (1965) stackloss
data.
The three explanatory variables correspond to
measurements for a plant oxidizing ammonia to nitric acid:
 x_{1} air flow to the plant
 x_{2} cooling water inlet temperature
 x_{3} acid concentration
on 21 consecutive days. The response variable y_{i} gives
the permillage of ammonia lost (stackloss). The data are also
given by Rousseeuw & Leroy (1987, p.76) and
Osborne (1985, p.267):
print "Stackloss Data";
aa = { 1 80 27 89 42,
1 80 27 88 37,
1 75 25 90 37,
1 62 24 87 28,
1 62 22 87 18,
1 62 23 87 18,
1 62 24 93 19,
1 62 24 93 20,
1 58 23 87 15,
1 58 18 80 14,
1 58 18 89 14,
1 58 17 88 13,
1 58 18 82 11,
1 58 19 93 12,
1 50 18 89 8,
1 50 18 86 7,
1 50 19 72 8,
1 50 19 79 8,
1 50 20 80 9,
1 56 20 82 15,
1 70 20 91 15 };
Rousseeuw & Leroy (1987, p. 76) cite a large number of
papers in which this data set was analyzed and state
that most researchers "concluded that observations 1, 3,
4, and 21 were outliers"; some authors also reported
observation 2 as an outlier.
For N=21 and n=4 (three explanatory variables plus an
intercept), there is a total of 5985 different subsets of
4 observations out of 21. If you do not specify
optn[5], the LMS and LTS algorithms draw N_{Rep}=2000
random subsets. Since there is a large number of subsets
with singular linear systems, which you do not want to print,
you choose optn[2]=2 for reduced printed output.
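The subset count quoted above can be verified directly; this is an illustrative Python check, not part of the SAS program:

```python
from math import comb

# Number of ways to choose 4 observations out of 21: C(21, 4)
n_subsets = comb(21, 4)
print(n_subsets)  # 5985
```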
a = aa[,2:4]; b = aa[,5];
optn = j(8,1,.);
optn[2]= 2; /* ipri */
optn[3]= 3; /* ilsq */
optn[8]= 3; /* icov */
CALL LMS(sc,coef,wgt,optn,b,a);
LMS: The 13th ordered squared residual will be minimized.
Median and Mean
Median Mean
VAR1 58 60.428571429
VAR2 20 21.095238095
VAR3 87 86.285714286
Intercep 1 1
Response 15 17.523809524
Dispersion and Standard Deviation
Dispersion StdDev
VAR1 5.930408874 9.1682682584
VAR2 2.965204437 3.160771455
VAR3 4.4478066555 5.3585712381
Intercep 0 0
Response 5.930408874 10.171622524
The following are the results of LS regression:
Unweighted Least-Squares Estimation
LS Parameter Estimates
                        Approx
Variable   Estimate     Std Err    t Value  Pr > |t|
VAR1        0.715640    0.134858      5.31    <.0001
VAR2        1.295286    0.368024      3.52    0.0026
VAR3       -0.152123    0.156294     -0.97    0.3440
Intercep  -39.919674   11.895997     -3.36    0.0038
Variable   Lower WCI   Upper WCI
VAR1        0.451323    0.979957
VAR2        0.573972    2.016600
VAR3       -0.458453    0.154208
Intercep  -63.235400  -16.603949
Sum of Squares = 178.8299616
Degrees of Freedom = 17
LS Scale Estimate = 3.2433639182
Cov Matrix of Parameter Estimates
VAR1 VAR2 VAR3 Intercep
VAR1 0.018187 0.036511 0.007144 0.287587
VAR2 0.036511 0.135442 0.000010 0.651794
VAR3 0.007144 0.000011 0.024428 1.676321
Intercep 0.287587 0.651794 1.676321 141.514741
Rsquared = 0.9135769045
F(3,17) Statistic = 59.9022259
Probability = 3.0163272E-9
These are the LMS results for the 2000 random subsets:
Random Subsampling for LMS
Best
Subset Singular Criterion Percent
500 23 0.163262 25
1000 55 0.140519 50
1500 79 0.140519 75
2000 103 0.126467 100
Minimum Criterion= 0.1264668282
Least Median of Squares (LMS) Method
Minimizing 13th Ordered Squared Residual.
Highest Possible Breakdown Value = 42.86 %
Random Selection of 2103 Subsets
Among 2103 subsets 103 are singular.
Observations of Best Subset
15 11 19 10
Estimated Coefficients
VAR1   VAR2   VAR3   Intercep
0.75   0.5    0      -39.25
LMS Objective Function = 0.75
Preliminary LMS Scale = 1.0478510755
Robust R Squared = 0.96484375
Final LMS Scale = 1.2076147288
For LMS, observations 1, 3, 4, and 21 have scaled residuals
larger than 2.5 (table not shown) and are considered outliers.
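The reported breakdown value of 42.86 % can be checked against the standard formula ([(N - n)/2] + 1)/N for the default quantile h (a sketch based on the formula given by Rousseeuw & Leroy (1987), not output of the subroutine):

```python
# Stack-loss data: N = 21 observations, n = 4 parameters
N, n = 21, 4
breakdown = ((N - n) // 2 + 1) / N   # floor(17/2) + 1 = 9 bad points tolerated
print(round(100 * breakdown, 2))     # 42.86
```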
These are the corresponding WLS results:
Weighted Least-Squares Estimation
RLS Parameter Estimates Based on LMS
                        Approx
Variable   Estimate     Std Err    t Value  Pr > |t|
VAR1        0.797686    0.067439     11.83    <.0001
VAR2        0.577340    0.165969      3.48    0.0041
VAR3       -0.067060    0.061603     -1.09    0.2961
Intercep  -37.652459    4.732051     -7.96    <.0001
           Lower WCI   Upper WCI
            0.665507    0.929864
            0.252047    0.902634
           -0.187800    0.053680
          -46.927108  -28.377810
Weighted Sum of Squares = 20.400800254
Degrees of Freedom = 13
RLS Scale Estimate = 1.2527139846
Cov Matrix of Parameter Estimates
VAR1 VAR2 VAR3 Intercep
VAR1 0.004548 0.007921 0.001199 0.001568
VAR2 0.007921 0.027546 0.000463 0.065018
VAR3 0.001199 0.000463 0.003795 0.246102
Intercep 0.001568 0.065018 0.246102 22.392305
Weighted Rsquared = 0.9750062263
F(3,13) Statistic = 169.04317954
Probability = 1.158521E-10
There are 17 points with nonzero weight.
Average Weight = 0.8095238095
References

Brownlee, K.A. (1965),
Statistical Theory and Methodology in Science and
Engineering, New York: John Wiley & Sons, Inc.

Davies, L. (1992),
"The Asymptotics of Rousseeuw's Minimum Volume Ellipsoid
Estimator," The Annals of Statistics, 20,
1828-1843.

Rousseeuw, P.J. (1984),
"Least Median of Squares Regression,"
Journal of the American Statistical Association,
79, 871-880.

Rousseeuw, P.J. (1985),
"Multivariate Estimation with High Breakdown Point,"
in Mathematical Statistics and Applications,
ed. W. Grossmann, G. Pflug, I. Vincze, and W. Wertz,
Dordrecht: Reidel Publishing Company, 283-297.

Rousseeuw, P.J. and Croux, C. (1993),
"Alternatives to the Median Absolute Deviation,"
Journal of the American Statistical Association,
88, 1273-1283.

Rousseeuw, P.J. and Hubert, M. (1997),
"Recent Developments in PROGRESS,"
in L_{1}-Statistical Procedures and Related Topics,
ed. Y. Dodge, IMS Lecture Notes - Monograph Series,
No. 31, 201-214.

Rousseeuw, P.J. and Leroy, A.M. (1987),
Robust Regression and Outlier Detection,
New York: John Wiley & Sons, Inc.

Rousseeuw, P.J. and Van Driessen, K. (1997),
"A Fast Algorithm for the Minimum Covariance Determinant
Estimator," submitted for publication.

Rousseeuw, P.J. and Van Zomeren, B.C. (1990),
"Unmasking Multivariate Outliers and Leverage Points,"
Journal of the American Statistical Association,
85, 633-639.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.