Comparative Analysis of Outlier Detection Techniques

Data mining is the extraction of interesting patterns from massive data sets. Outlier detection, an important aspect of data mining, identifies observations that deviate from expected behavior. Outlier detection and analysis is sometimes known as outlier mining. In this paper, we provide a broad and comprehensive literature survey of outliers and outlier detection techniques, so as to explain the richness and complexity associated with each technique. We also give a broad comparison of the various methods within the different outlier detection techniques.
