RCA: A Deep Collaborative Autoencoder Approach for Anomaly Detection

Unsupervised anomaly detection (AD) plays a crucial role in many critical applications. Driven by the success of deep learning, recent years have witnessed growing interest in applying deep neural networks (DNNs) to AD problems. A common approach is to use an autoencoder to learn a feature representation of the normal observations in the data; the reconstruction error of the autoencoder is then used as an outlier score to detect anomalies. However, due to the high model capacity introduced by the over-parameterization of DNNs, the reconstruction error of the anomalies can also be small, which hampers the effectiveness of these methods. To alleviate this problem, we propose a robust framework that uses collaborative autoencoders to jointly identify the normal observations in the data while learning their feature representation. We investigate the theoretical properties of the framework and empirically show that it outperforms other DNN-based methods. Empirical results also show that the framework is more resilient to missing values than other baseline methods.
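To make the reconstruction-error scoring principle described in the abstract concrete, the following is a minimal sketch (not the RCA algorithm itself) that trains a tied-weight linear autoencoder by gradient descent and ranks points by their reconstruction error. The synthetic data, dimensions, and hyperparameters are all illustrative assumptions: normal points lie near a 2-D subspace of a 10-D space, and a few large off-subspace points play the role of anomalies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 normal points near a 2-D subspace of R^10,
# plus 5 large off-subspace anomalies appended at indices 200-204.
basis = rng.standard_normal((2, 10))
normal = rng.standard_normal((200, 2)) @ basis + 0.05 * rng.standard_normal((200, 10))
anomalies = 6.0 * rng.standard_normal((5, 10))
X = np.vstack([normal, anomalies])

# Tied-weight linear autoencoder: encoder W, decoder W^T.
# Minimize ||X W W^T - X||_F^2 by plain gradient descent.
W = 0.01 * rng.standard_normal((10, 2))
lr = 0.005
for _ in range(800):
    R = X @ W @ W.T - X                          # reconstruction residual
    grad = (X.T @ R @ W + R.T @ X @ W) / len(X)  # gradient of the squared Frobenius loss
    W -= lr * grad

# Outlier score = per-point reconstruction error.
scores = np.square(X @ W @ W.T - X).sum(axis=1)
top5 = np.argsort(scores)[-5:]
print(sorted(top5))  # the planted anomalies should receive the largest scores
```

This linear sketch separates anomalies cleanly only because its capacity is limited; an over-parameterized deep autoencoder can drive the reconstruction error of the anomalies down as well, which is precisely the failure mode the abstract identifies and that the proposed collaborative framework is designed to mitigate.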
