A Survey of Outlier Detection Methodologies

Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise from mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error, or simply natural deviations in populations. Detecting them can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we present a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
