Deep Weakly-supervised Anomaly Detection

Anomaly detection is typically formulated as an unsupervised learning task in the literature because of the prohibitive cost and difficulty of obtaining large-scale labeled anomaly data. This ignores the fact that a very small number (e.g., a few dozen) of labeled anomalies can often be made available at little cost in many real-world anomaly detection applications. To leverage such labeled anomaly data, we study an important anomaly detection problem termed weakly-supervised anomaly detection, in which a limited number of labeled anomalies are available during modeling in addition to a large amount of unlabeled data. Learning with the small set of labeled anomalies enables anomaly-informed modeling, which helps identify the anomalies of interest and addresses the notoriously high false-positive rates of unsupervised anomaly detection. However, the problem is especially challenging, since (i) the limited labeled anomaly data often, if not always, cannot cover all types of anomalies and (ii) the unlabeled data is dominated by normal instances but is contaminated with anomalies. We address the problem by formulating it as a pairwise relation prediction task. Specifically, our approach defines a two-stream ordinal regression neural network that learns the relation of randomly sampled instance pairs, i.e., whether a pair contains two labeled anomalies, one labeled anomaly, or only unlabeled instances. The resulting model leverages both the labeled and the unlabeled data to substantially augment the training data and to learn well-generalized representations of normality and abnormality. Comprehensive empirical results on 40 real-world datasets show that our approach (i) significantly outperforms four state-of-the-art methods in detecting both known and previously unseen anomalies and (ii) is substantially more data-efficient.
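To make the pairwise formulation concrete, the following is a minimal sketch of how training pairs and their ordinal targets might be built from a small set of labeled anomalies and a larger unlabeled set. The function name `sample_pairs`, the numeric targets in `PAIR_TARGETS`, and the uniform choice among the three pair types are assumptions made for illustration; the abstract specifies only that the three pair types are distinguished by an ordinal regression network.

```python
import random

# Hypothetical ordinal targets: the abstract only states that the three pair
# types are ordered; these numeric values are assumptions for illustration.
PAIR_TARGETS = {
    "anomaly-anomaly": 2.0,
    "anomaly-unlabeled": 1.0,
    "unlabeled-unlabeled": 0.0,
}


def sample_pairs(labeled_anomalies, unlabeled, n_pairs, seed=0):
    """Sample instance pairs (with replacement) and attach an ordinal target."""
    rng = random.Random(seed)
    pair_types = list(PAIR_TARGETS)
    pairs = []
    for _ in range(n_pairs):
        # Assumption: each pair type is drawn with equal probability.
        kind = rng.choice(pair_types)
        if kind == "anomaly-anomaly":
            a, b = rng.choice(labeled_anomalies), rng.choice(labeled_anomalies)
        elif kind == "anomaly-unlabeled":
            a, b = rng.choice(labeled_anomalies), rng.choice(unlabeled)
        else:
            a, b = rng.choice(unlabeled), rng.choice(unlabeled)
        pairs.append((a, b, PAIR_TARGETS[kind]))
    return pairs


# Toy data: two labeled anomalies and a few unlabeled instances.
anomalies = [[9.0, 9.5], [8.7, 10.1]]
unlabeled = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1]]
training_pairs = sample_pairs(anomalies, unlabeled, n_pairs=6)
```

Because pairs are drawn combinatorially, even a few dozen labeled anomalies yield a large number of distinct training pairs, which is one way to read the abstract's claim that the formulation augments the training data. A two-stream network would then take each pair as input and regress the attached target.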
