Classifier-Adjusted Density Estimation for Anomaly Detection and One-Class Classification

Density estimation methods are often regarded as unsuitable for anomaly detection in high-dimensional data due to the difficulty of estimating multivariate probability distributions. Instead, the scores from popular distanceand localdensity-based methods, such as local outlier factor (LOF), are used as surrogates for probability densities. We question this infeasibility assumption and explore a family of simple statistically-based density estimates constructed by combining a probabilistic classifier with a naive density estimate. Across a number of semi-supervised and unsupervised problems formed from real-world data sets, we show that these methods are competitive with LOF and that even simple density estimates that assume attribute independence can perform strongly. We show that these density estimation methods scale well to data with high dimensionality and that they are robust to the problem of irrelevant attributes that plagues methods based on local estimates.

[1]  Robert P. W. Duin,et al.  Support Vector Data Description , 2004, Machine Learning.

[2]  Salvatore J. Stolfo,et al.  Using artificial anomalies to detect unknown and known network intrusions , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[3]  Robert Tibshirani,et al.  The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition , 2001, Springer Series in Statistics.

[4]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SKDD.

[5]  Sameer Singh,et al.  Novelty detection: a review - part 1: statistical approaches , 2003, Signal Process..

[6]  Andrew W. Moore,et al.  Active Learning for Anomaly and Rare-Category Detection , 2004, NIPS.

[7]  Fabio A. González,et al.  Anomaly Detection Using Real-Valued Negative Selection , 2003, Genetic Programming and Evolvable Machines.

[8]  J. D. Malley,et al.  Probability Machines , 2011, Methods of Information in Medicine.

[9]  Takafumi Kanamori,et al.  Statistical outlier detection using direct density ratio estimation , 2011, Knowledge and Information Systems.

[10]  Emmanuel Müller,et al.  Statistical selection of relevant subspace projections for outlier ranking , 2011, 2011 IEEE 27th International Conference on Data Engineering.

[11]  VARUN CHANDOLA,et al.  Anomaly detection: A survey , 2009, CSUR.

[12]  Vic Barnett,et al.  Outliers in Statistical Data , 1980 .

[13]  Bianca Zadrozny,et al.  Outlier detection by active learning , 2006, KDD '06.

[14]  Hans-Peter Kriegel,et al.  A survey on unsupervised outlier detection in high‐dimensional numerical data , 2012, Stat. Anal. Data Min..

[15]  Kwang-Ho Ro,et al.  Outlier detection for high-dimensional data , 2015 .

[16]  Raymond T. Ng,et al.  Distance-based outliers: algorithms and applications , 2000, The VLDB Journal.

[17]  Hans-Peter Kriegel,et al.  LOF: identifying density-based local outliers , 2000, SIGMOD '00.

[18]  James Theiler,et al.  Resampling approach for anomaly detection in multispectral images , 2003, SPIE Defense + Commercial Sensing.

[19]  Christopher M. Bishop,et al.  Novelty detection and neural network validation , 1994 .

[20]  Vipin Kumar,et al.  Feature bagging for outlier detection , 2005, KDD '05.

[21]  Tom Fawcett,et al.  An introduction to ROC analysis , 2006, Pattern Recognit. Lett..

[22]  Tarn Duong,et al.  ks: Kernel Density Estimation and Kernel Discriminant Analysis for Multivariate Data in R , 2007 .

[23]  E. Parzen On Estimation of a Probability Density Function and Mode , 1962 .

[24]  M. M. Moya,et al.  One-class classifier networks for target recognition applications , 1993 .

[25]  Pedro M. Domingos,et al.  Tree Induction for Probability-Based Ranking , 2003, Machine Learning.

[26]  Bernhard Schölkopf,et al.  Estimating the Support of a High-Dimensional Distribution , 2001, Neural Computation.

[27]  Hans-Peter Kriegel,et al.  Outlier Detection in Axis-Parallel Subspaces of High Dimensional Data , 2009, PAKDD.

[28]  Marco Scutari,et al.  Learning Bayesian Networks with the bnlearn R Package , 2009, 0908.3817.

[29]  Jian Tang,et al.  Enhancing Effectiveness of Outlier Detections for Low Density Patterns , 2002, PAKDD.

[30]  Thomas G. Dietterich,et al.  Detecting insider threats in a real corporate database of computer usage activity , 2013, KDD.

[31]  Michael Brady,et al.  Novelty detection for the identification of masses in mammograms , 1995 .

[32]  Ian H. Witten,et al.  One-Class Classification by Combining Density and Class Probability Estimation , 2008, ECML/PKDD.

[33]  Victoria J. Hodge,et al.  A Survey of Outlier Detection Methodologies , 2004, Artificial Intelligence Review.

[34]  Christos Faloutsos,et al.  LOCI: fast outlier detection using the local correlation integral , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[35]  David J. Hand,et al.  CASOS: a subspace method for anomaly detection in high dimensional astronomical databases , 2013, Stat. Anal. Data Min..