Self-organizing maps for outlier detection

In this paper we address the problem of multivariate outlier detection using the (unsupervised) self-organizing map (SOM) algorithm introduced by Kohonen. We examine a number of techniques, based on summary statistics and graphics derived from the trained SOM, and conclude that they work well in cooperation with each other. Useful tools include the median interneuron distance matrix and the projection of the trained map (via Sammon's projection). SOM quantization errors provide an important complementary source of information for certain types of outlying behavior. Empirical results are reported on both artificial and real data.
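
To make the role of quantization errors concrete, the following is a minimal sketch and not the authors' implementation. It trains a small rectangular SOM with the standard online update rule and scores each observation by its distance to the best-matching neuron. The function names (`train_som`, `quantization_errors`), the grid size, the initialization, and the linear decay schedules for the learning rate and neighbourhood width are all assumptions chosen for illustration.

```python
import numpy as np


def train_som(data, rows=5, cols=5, n_iter=2000, lr0=0.5,
              sigma0=2.0, sigma_end=0.5, seed=0):
    """Train a small rectangular SOM (illustrative settings, not from the paper)."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    # Assumption: start all neurons near the data mean with small random noise.
    weights = data.mean(axis=0) + 0.1 * data.std(axis=0) * rng.standard_normal(
        (rows * cols, d)
    )
    # Grid coordinates of each neuron, used by the neighbourhood function.
    grid = np.array(
        [(r, c) for r in range(rows) for c in range(cols)], dtype=float
    )
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                       # learning rate decays to 0
        sigma = sigma0 + (sigma_end - sigma0) * frac  # neighbourhood width shrinks
        x = data[rng.integers(0, n)]                  # one random observation
        # Best-matching unit: the neuron whose weight vector is closest to x.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Gaussian neighbourhood measured on the map grid, centred on the BMU.
        grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights


def quantization_errors(data, weights):
    """Distance from each observation to its best-matching neuron."""
    dists = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return dists.min(axis=1)


# Usage: a Gaussian cloud plus one planted point far from the bulk of the data.
rng = np.random.default_rng(1)
cloud = rng.standard_normal((200, 2))
data = np.vstack([cloud, [[8.0, 8.0]]])
weights = train_som(data)
errors = quantization_errors(data, weights)
suspect = int(np.argmax(errors))  # index of the largest quantization error
```

Observations with unusually large quantization errors are candidates for closer inspection. As the abstract notes, this captures only certain types of outlying behavior, so in practice it would be read alongside the interneuron distance matrix and the projected map rather than on its own.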
