Estimation of Contamination Parameters and Identification of Outliers in Multivariate Data

Multivariate outliers may be modeled using the contaminated multivariate normal distribution, which has two parameters indicating the percentage of outliers and the degree of contamination. Recent developments in elliptical distribution theory are used to derive estimators of these parameters. These estimators can be used with an index of Mahalanobis distance to identify the multivariate outliers, which can then be eliminated to obtain approximately normal data. The performance of the proposed estimators and outlier rejection procedures is evaluated in a small simulation study.
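As a rough illustration of the setting described above, the following minimal sketch draws bivariate data from a contaminated normal mixture and flags points whose squared Mahalanobis distance exceeds a chi-square cutoff. It is not the estimators or the rejection procedure proposed here: the names eps (outlier proportion) and k (variance inflation factor), the zero mean and identity covariance, the use of the plain sample mean and covariance, and the 0.975 chi-square cutoff are all assumptions made for the example.

```python
import math
import random


def sample_contaminated(n, eps, k, rng):
    """Draw n bivariate points from (1 - eps) * N(0, I) + eps * N(0, k * I).

    eps and k are hypothetical names for the two contamination parameters.
    """
    points, labels = [], []
    for _ in range(n):
        is_outlier = rng.random() < eps
        scale = math.sqrt(k) if is_outlier else 1.0
        points.append((scale * rng.gauss(0, 1), scale * rng.gauss(0, 1)))
        labels.append(is_outlier)
    return points, labels


def squared_mahalanobis(points):
    """Squared Mahalanobis distances using the sample mean and covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the closed-form inverse of the 2x2 covariance.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def chi2_quantile_2df(q):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(1.0 - q)


def flag_outliers(points, q=0.975):
    """Flag points whose squared distance exceeds the assumed cutoff."""
    cutoff = chi2_quantile_2df(q)
    return [d2 > cutoff for d2 in squared_mahalanobis(points)]


rng = random.Random(0)
data, truth = sample_contaminated(500, eps=0.10, k=9.0, rng=rng)
flags = flag_outliers(data)
cleaned = [p for p, f in zip(data, flags) if not f]
print(len(data), sum(truth), sum(flags), len(cleaned))
```

Because the contaminating component inflates the sample covariance, this naive cutoff tends to miss some outliers, which is one reason for estimating the contamination parameters directly rather than relying on uncorrected distances.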