Chapter Contents Previous Next
 Language Reference

## LMS and LTS Calls

performs robust regression

CALL LMS( sc, coef, wgt, opt, y<, <x><, sorb>>);

robust (resistant) regression method, defined by minimizing the hth ordered squared residual.

CALL LTS( sc, coef, wgt, opt, y<, <x><, sorb>>);

robust (resistant) regression method, defined by minimizing the sum of the h smallest squared residuals.

The Least Median of Squares (LMS) and Least Trimmed Squares (LTS) subroutines perform robust regression (sometimes called resistant regression). They are able to detect outliers and perform a least-squares regression on the remaining observations.

The value of h may be specified, but in most applications the default value works just fine and the results seem to be quite stable toward different choices of h.

The inputs to the LMS and LTS subroutines are as follows:

opt
refers to an options vector with the following components (missing values are treated as default values). The options vector can be a null vector.

opt[1]
specifies whether an intercept is used in the model (opt[1]=0) or not (opt[1]). If opt[1]=0, then a column of 1s is added as the last column to the input matrix X; that is, you do not need to add this column of 1s yourself. The default is opt[1]=0.

opt[2]
specifies the amount of printed output. Higher values request additional output and include the output of lower values.

opt[2]=0
prints no output except error messages.

opt[2]=1
prints all output except (1) arrays of O(N), such as weights, residuals, and diagnostics; (2) the history of the optimization process; and (3) subsets that result in singular linear systems.

opt[2]=2
additionally prints arrays of O(N), such as weights, residuals, and diagnostics; also prints the case numbers of the observations in the best subset and some basic history of the optimization process.

opt[2]=3
additionally prints subsets that result in singular linear systems.

The default is opt[2]=0.

opt[3]
specifies whether only LMS or LTS is computed or whether, additionally, least-squares (LS) and weighted least-squares (WLS) regression are computed:

opt[3]=0
computes only LMS or LTS.

opt[3]=1
computes, in addition to LMS or LTS, weighted least-squares regression on the observations with small LMS or LTS residuals (where small is defined by opt[8]).

opt[3]=2
computes, in addition to LMS or LTS, unweighted least-squares regression.

opt[3]=3
adds both unweighted and weighted least-squares regression to LMS and LTS regression.

The default is opt[3]=0.

opt[4]
specifies the quantile h to be minimized. This is used in the objective function. The default is opt[5]= h = [[(N+n+1)/2]], which corresponds to the highest possible breakdown value. This is also the default of the PROGRESS program. The value of h should be in the range

opt[5]
specifies the number NRep of generated subsets. Each subset consists of n observations (k1, ... ,kn), where . The total number of subsets consisting of n observations out of N observations is
where n is the number of parameters including the intercept.

Due to computer time restrictions, not all subset combinations of n observations out of N can be inspected for larger values of N and n. Specifying a value of NRep < Ntot enables you to save computer time at the expense of computing a suboptimal solution.

If opt[5] is zero or missing, the default number of subsets is taken from the following table.

 n 1 2 3 4 5 6 7 8 9 10 Nlower 500 50 22 17 15 14 0 0 0 0 Nupper 1000000 1414 182 71 43 32 27 24 23 22 NRep 500 1000 1500 2000 2500 3000 3000 3000 3000 3000

 n 11 12 13 14 15 Nlower 0 0 0 0 0 Nupper 22 22 22 23 23 NRep 3000 3000 3000 3000 3000

If the number of cases (observations) N is smaller than Nlower, then all possible subsets are used; otherwise, NRep subsets are drawn randomly. This means that an exhaustive search is performed for opt[6]=-1. If N is larger than Nupper, a note is printed in the log file indicating how many subsets exist.

opt[7]
specifies that the latest argument sorb contains a given parameter vector b rather than a given subset for which the objective function should be evaluated.

opt[8]
is relevant only for LS and WLS regression (opt[3] > 0). It specifies whether the covariance matrix of parameter estimates and approximate standard errors (ASEs) are computed and printed:

opt[8]=0
does not compute covariance matrix and ASEs.

opt[8]=1
computes the covariance matrix and ASEs but prints only the ASEs.

opt[8]=3
computes and prints both the covariance matrix and the ASEs.

The default is opt[8]=0.

y
refers to an N response vector y.

x
refers to an N ×n matrix X of regressors. If opt[1] is zero or missing, an intercept is added by default as the last column of X. If the matrix X is not specified, y is analyzed as a univariate data set.

sorb
refers to an n vector containing either of the following:

• n observation numbers of a subset for which the objective function should be evaluated; this subset can be the start for a pairwise exchange algorithm if opt[4] is specified.
• n given parameters b = (b1, ... ,bn) (including the intercept, if necessary) for which the objective function should be evaluated.

Missing values are not permitted in x or y. Missing values in opt cause the default value to be used.

The LMS and LTS subroutines return the following values:

sc
is a column vector containing the following scalar information, where rows 1 -9 correspond to LMS or LTS regression and rows 11 -14 correspond to either LS or WLS:

sc[1]
the quantile h used in the objective function

sc[2]
number of subsets generated

sc[3]
number of subsets with singular linear systems

sc[4]
number of nonzero weights wi

sc[5]
lowest value of the objective function FLMS or FLTS attained

sc[6]
preliminary LMS or LTS scale estimate SP

sc[7]
final LMS or LTS scale estimate SF

sc[8]
robust R2 (coefficient of determination)

sc[9]
asymptotic consistency factor

If opt[3] > 0, then the following can also be set:

sc[11]
LS or WLS objective function (sum of squared residuals)

sc[12]
LS or WLS scale estimate

sc[13]
R2 value for LS or WLS

sc[14]
F value for LS or WLS

For opt[3]=1 or opt[3]=3, these rows correspond to WLS estimates; for opt[3]=2, these rows correspond to LS estimates.

coef
is a matrix with n columns containing the following results in its rows:

coef[1,]
LMS or LTS parameter estimates

coef[2,]
indices of observations in the best subset

If opt[3] > 0, then the following can also be set:

coef[3]
LS or WLS parameter estimates

coef[4]
approximate standard errors of LS or WLS estimates

coef[5]
t-values

coef[6]
p-values

coef[7]
lower boundary of Wald confidence intervals

coef[8]
upper boundary of Wald confidence intervals

For opt[3]=1 or opt[3]=3, these rows correspond to WLS estimates; for opt[3]=2, to LS estimates.

wgt
is a matrix with N columns containing the following results in its rows:

wgt[1]
weights (=1 for small, =0 for large residuals)

wgt[2]
residuals ri = yi - xi b

wgt[3]
resistant diagnostic ui (note that the resistant diagnostic cannot be computed for a perfect fit when the objective function is zero or nearly zero)

### Example

Consider results for Brownlee's (1965) stackloss data. The three explanatory variables correspond to measurements for a plant oxidizing ammonia to nitric acid:
• x1 air flow to the plant
• x2 cooling water inlet temperature
• x3 acid concentration
on 21 consecutive days. The response variable yi gives the permillage of ammonia lost (stackloss). The data are also given by Rousseeuw & Leroy (1987, p.76) and Osborne (1985, p.267):
 print "Stackloss Data";
aa = { 1  80  27  89  42,
1  80  27  88  37,
1  75  25  90  37,
1  62  24  87  28,
1  62  22  87  18,
1  62  23  87  18,
1  62  24  93  19,
1  62  24  93  20,
1  58  23  87  15,
1  58  18  80  14,
1  58  18  89  14,
1  58  17  88  13,
1  58  18  82  11,
1  58  19  93  12,
1  50  18  89   8,
1  50  18  86   7,
1  50  19  72   8,
1  50  19  79   8,
1  50  20  80   9,
1  56  20  82  15,
1  70  20  91  15 };


Rousseeuw & Leroy (1987, p.76) cite a large number of papers where this data set was analyzed before and state that most researchers "concluded that observations 1, 3, 4, and 21 were outliers," and some people also reported observation 2 as outlier.

For N=21 and n=4 (three explanatory variables including intercept), you obtain a total of 5985 different subsets of 4 observations out of 21. If you decide not to specify optn[5], your LMS and LTS algorithms draw Nrep=2000 random sample subsets. Since there is a large number of subsets with singular linear systems, which you do not want to print, you chose optn[2]=2; for reduced printed output.

  a = aa[,2:4]; b = aa[,5];
optn = j(8,1,.);
optn[2]= 2;    /* ipri */
optn[3]= 3;    /* ilsq */
optn[8]= 3;    /* icov */

CALL LMS(sc,coef,wgt,optn,b,a);


   LMS: The 13th ordered squared residual will be minimized.

Median and Mean

Median              Mean

VAR1                    58      60.428571429
VAR2                    20      21.095238095
VAR3                    87      86.285714286
Intercep                 1                 1
Response                15      17.523809524

Dispersion and Standard Deviation

Dispersion            StdDev

VAR1           5.930408874      9.1682682584
VAR2           2.965204437       3.160771455
VAR3          4.4478066555      5.3585712381
Intercep                 0                 0
Response       5.930408874      10.171622524


The following are the results of LS regression:

                    Unweighted Least-Squares Estimation

LS Parameter Estimates

Approx                 Pr >
Variable    Estimate     Std Err    t Value     |t|

VAR1        0.715640   0.134858     5.31      <.0001
VAR2        1.295286   0.368024     3.52      0.0026
VAR3       -0.152123   0.156294    -0.97      0.3440
Intercep   -39.919674 11.895997    -3.36      0.0038

Variable    Lower WCI   Upper WCI

VAR1        0.451323  0.979957
VAR2        0.573972  2.016600
VAR3       -0.458453  0.154208
Intercep   -63.2354  -16.603949

Sum of Squares = 178.8299616
Degrees of Freedom = 17
LS Scale Estimate = 3.2433639182

Cov Matrix of Parameter Estimates

VAR1        VAR2         VAR3       Intercep

VAR1        0.018187   -0.036511    0.007144    0.287587
VAR2       -0.036511    0.135442    0.000010   -0.651794
VAR3       -0.007144    0.000011    0.024428   -1.676321
Intercep    0.287587   -0.651794    1.676321  141.514741

R-squared = 0.9135769045
F(3,17) Statistic = 59.9022259
Probability = 3.0163272E-9


These are the LMS results for the 2000 random subsets:

            Random Subsampling for LMS

Best
Subset    Singular       Criterion     Percent

500          23        0.163262          25
1000          55        0.140519          50
1500          79        0.140519          75
2000         103        0.126467         100

Minimum Criterion= 0.1264668282
Least Median of Squares (LMS) Method
Minimizing 13th Ordered Squared Residual.
Highest Possible Breakdown Value =  42.86 %
Random Selection of 2103 Subsets
Among 2103 subsets 103 are singular.

Observations of Best Subset

15                11                19                10

Estimated Coefficients

VAR1              VAR2              VAR3          Intercep

0.75               0.5                 0            -39.25

LMS Objective Function = 0.75
Preliminary LMS Scale = 1.0478510755
Robust R Squared = 0.96484375
Final LMS Scale = 1.2076147288


For LMS observations, 1, 3, 4, and 21 have scaled residuals larger than 2.5 (table not shown) and are considered outliers. These are the corresponding WLS results:

                       Weighted Least-Squares Estimation

RLS Parameter Estimates Based on LMS

Approx             Pr >
Variable    Estimate     Std Err  t Value     |t|

VAR1        0.797686  0.067439    11.83     <.0001
VAR2        0.577340  0.165969     3.48     0.0041
VAR3       -0.067060  0.061603    -1.09     0.2961
Intercep  -37.652459  4.732051    -7.96     <.0001

Lower WCI   Upper WCI

0.665507    0.929864
0.252047    0.902634
-0.187800    0.053680
-46.927108  -28.37781

Weighted Sum of Squares = 20.400800254
Degrees of Freedom = 13
RLS Scale Estimate = 1.2527139846

Cov Matrix of Parameter Estimates

VAR1         VAR2         VAR3         Intercep

VAR1       0.004548    -0.007921    -0.001199     0.001568
VAR2      -0.007921     0.027546    -0.000463    -0.065018
VAR3      -0.001199    -0.000463     0.003795    -0.246102
Intercep   0.001568    -0.065018    -0.246102    22.392305

Weighted R-squared = 0.9750062263
F(3,13) Statistic = 169.04317954
Probability = 1.158521E-10
There are 17 points with nonzero weight.
Average Weight = 0.8095238095


### References

• Brownlee, K.A. (1965), Statistical Theory and Methodology in Science and Engineering, New York: John Wiley & Sons, Inc.
• Davies, L. (1992), The Asymptotics of Rousseeuw's Minimum Volume Ellipsoid Estimator,'' The Annals of Statistics, 20, 1828 -1843.
• Rousseeuw, P.J. (1984), "Least Median of Squares Regression," Journal of the American Statistical Association, 79, 871 -880.
• Rousseeuw, P.J. (1985), "Multivariate Estimation with High Breakdown Point," in Mathematical Statistics and Applications, ed. by W. Grossmann, G. Pflug, I. Vincze, and W. Wertz, Dordrecht: Reidel Publishing Company, 283 -297.
• Rousseeuw, P.J. and Croux, C. (1993), "Alternatives to the Median Absolute Deviation," Journal of the American Statistical Association, 88, 1273 -1283.
• Rousseeuw, P.J. and Hubert, M. (1997), "Recent Developments in PROGRESS," in L1-Statistical Procedures and Related Topics, ed. by Y. Dodge, IMS Lecture Notes - Monograph Series, No. 31, 201 -214.
• Rousseeuw, P.J. and Leroy, A.M. (1987), Robust Regression and Outlier Detection, New York: John Wiley & Sons, Inc.
• Rousseeuw, P.J. and Van Driessen, K. (1997), A fast Algorithm for the Minimum Covariance Determinant Estimator,'' submitted for publication.
• Rousseeuw, P.J. and Van Zomeren, B.C. (1990), "Unmasking Multivariate Outliers and Leverage Points," Journal of the American Statistical Association, 85, 633 -639.

 Chapter Contents Previous Next Top