Identification of multivariate outliers – problems and challenges of visualization methods

Outlier identification is often treated as a way to remove observations from a data set so that they do not distort further analyses. But outliers may just as well be the interesting observations in themselves, because they can hint at particular structures in the data or at special events during the sampling period. Appropriate methods for detecting outliers are therefore needed. The literature abounds with procedures for detecting and testing single outliers in sample data. Detection becomes harder as the number of outliers and the dimension of the data increase, because outliers can be extreme in a growing number of directions. This study provides an overview of multivariate outlier detection methods, motivated by the growing importance of the problem in a wide variety of practical situations. We focus on methods that can be presented visually.
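To illustrate why outliers that are extreme in some direction can escape coordinate-wise checks, the following is a minimal sketch and is not taken from the study itself. It assumes a strongly correlated bivariate normal sample and uses the classical (non-robust) squared Mahalanobis distance as one possible multivariate measure; the helper name `mahalanobis_sq`, the simulated data, and the candidate point are all hypothetical choices made for illustration.

```python
import numpy as np

def mahalanobis_sq(X, point):
    """Squared Mahalanobis distance of `point` from the sample mean of X,
    using the classical sample covariance (an illustrative, non-robust choice)."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    diff = point - mean
    # Solve cov @ v = diff instead of inverting the covariance explicitly.
    return float(diff @ np.linalg.solve(cov, diff))

# Simulated data (assumption): two strongly positively correlated variables.
rng = np.random.default_rng(0)
cov_true = np.array([[1.0, 0.95],
                     [0.95, 1.0]])
X = rng.multivariate_normal([0.0, 0.0], cov_true, size=500)

# Hypothetical candidate: moderate in each coordinate, but it runs
# against the correlation structure of the sample.
candidate = np.array([1.5, -1.5])

# Coordinate-wise view: absolute z-scores of the candidate.
z_scores = np.abs((candidate - X.mean(axis=0)) / X.std(axis=0, ddof=1))

# Multivariate view: squared Mahalanobis distance, compared with the
# 0.975 quantile of a chi-square distribution with 2 degrees of freedom,
# which equals -2 * ln(0.025).
d2 = mahalanobis_sq(X, candidate)
cutoff = -2.0 * np.log(0.025)

print("coordinate-wise |z|:", z_scores)
print("squared Mahalanobis distance:", d2, "cutoff:", cutoff)
```

In this sketch neither coordinate of the candidate looks unusual on its own, yet the point lies far from the bulk of the data along the direction in which the sample varies least. Because the classical mean and covariance can themselves be distorted by outliers, a robust estimator would be the more cautious choice when several outliers are present.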
