Identification of point-mass contaminations in multivariate samples

The most reliable procedures for identifying outlying observations are based on a robustified Mahalanobis distance and have a very high computational cost, even for small problems. All of these procedures have difficulty identifying point-mass contaminations, in which the outliers are grouped into one or more clusters separated from the rest of the sample. This work describes a method designed specifically for this contamination pattern and shows that it successfully handles cases where methods based on robust estimators (the Minimum Volume Ellipsoid estimator or the Stahel-Donoho estimator) fail. The method is simple, exploratory in nature, and straightforward to apply with any standard statistical software package.