Adaptive Approach for Density-Approximating Neural Network Models for Anomaly Detection

We propose an alternative use of neural models in anomaly detection. Traditionally, neural models are used in anomaly detection in the form of auto-encoders, where the true degree of anomaly is proxied by the reconstruction error. Auto-encoders often perform well but are not guaranteed to perform as expected in all cases. A popular, more direct way of modeling the anomaly distribution is through k-Nearest Neighbor (kNN) models. Although kNN can outperform auto-encoders in some cases, its applicability can be seriously impaired by its space and time complexity, especially on high-dimensional, large-scale data. The alternative we propose is to model the distribution imposed by kNN using neural networks. We show that such neural models can achieve accuracy comparable to kNN while reducing computational complexity by orders of magnitude. The de-noising effect of a neural model with a limited number of neurons and layers is shown to lead to accuracy improvements in some cases. We evaluate the proposed idea against standard kNN and auto-encoders on a large set of benchmark data and show that in the majority of cases it is possible to improve on accuracy or computational cost.
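The idea of fitting a neural network to the scores produced by kNN can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the kNN score is taken to be the mean Euclidean distance to the k nearest training points, the network is a single tanh hidden layer trained by plain full-batch gradient descent, the data are synthetic Gaussian samples, and the names `knn_scores` and `fit_surrogate` are hypothetical.

```python
import numpy as np

def knn_scores(train, queries, k, exclude_self=False):
    """Assumed kNN anomaly score: mean Euclidean distance to the k nearest training points."""
    dists = np.linalg.norm(queries[:, None, :] - train[None, :, :], axis=2)
    dists.sort(axis=1)
    # When the training set scores itself, skip the zero distance of each point to itself.
    start = 1 if exclude_self else 0
    return dists[:, start:start + k].mean(axis=1)

def fit_surrogate(x, y, hidden=16, epochs=3000, lr=0.02, seed=0):
    """Fit a one-hidden-layer tanh network to the kNN scores y (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    mu, sigma = y.mean(), y.std()
    t = (y - mu) / sigma  # train on standardized targets
    w1 = rng.normal(0.0, 1.0 / np.sqrt(x.shape[1]), (x.shape[1], hidden))
    b1 = rng.normal(0.0, 1.0, hidden)
    w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)                  # hidden activations
        pred = (h @ w2 + b2).ravel()
        err = ((pred - t) / len(t))[:, None]      # gradient of half the mean squared error
        grad_h = (err @ w2.T) * (1.0 - h ** 2)    # backpropagate through tanh
        w2 -= lr * (h.T @ err)
        b2 -= lr * err.sum(axis=0)
        w1 -= lr * (x.T @ grad_h)
        b1 -= lr * grad_h.sum(axis=0)

    def predict(q):
        # Map the standardized output back to the scale of the kNN scores.
        return (np.tanh(q @ w1 + b1) @ w2 + b2).ravel() * sigma + mu

    return predict

# Synthetic example: 300 two-dimensional Gaussian points as "normal" data.
rng = np.random.default_rng(42)
train = rng.normal(size=(300, 2))
k = 5
targets = knn_scores(train, train, k, exclude_self=True)
surrogate = fit_surrogate(train, targets)

queries = np.array([[0.0, 0.0], [6.0, 6.0]])  # a central point and a distant point
exact = knn_scores(train, queries, k)          # needs the whole training set at query time
approx = surrogate(queries)                    # needs only the learned weights
print("kNN scores:      ", exact)
print("surrogate scores:", approx)
```

In this sketch the exact kNN score must keep and scan the full training set for every query, whereas the surrogate stores only its weights, so its scoring cost does not grow with the number of training points. How well such a small network tracks the kNN scores, particularly far from the training data, depends on the data and on the assumed settings above.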
