Robust deep auto-encoding Gaussian process regression for unsupervised anomaly detection

Abstract Unsupervised anomaly detection (AD) is of great importance in both fundamental machine learning research and industrial applications. Recent approaches have substantially improved the performance of unsupervised AD models. However, two difficult issues remain and have not been addressed simultaneously by existing models: efficiency degrades on high-dimensional data, and robustness is lost on contaminated data. In this work, we propose a novel hybrid unsupervised AD method which, for the first time, integrates a convolutional auto-encoder with Gaussian process regression to extract features and to remove anomalies from noisy data. Our model is more effective at modeling high-dimensional data and more robust to variation in the anomaly rate of the dataset. We evaluate it on four public benchmark datasets and show that it achieves state-of-the-art performance against competing methods.
