ConNet: Deep Semi-Supervised Anomaly Detection Based on Sparse Positive Samples

Existing semi-supervised anomaly detection methods usually train on a large amount of labeled normal data, which incurs high labeling costs. Only a few semi-supervised methods train models on unlabeled data together with a few labeled anomalies. However, such methods usually encounter two problems: (i) because anomalies exhibit different behavior patterns, or because the internal mechanisms that produce them are complex and diverse, a few labeled anomalies cannot cover all anomaly types; and (ii) the unlabeled data in the training set substantially outnumbers the labeled data, so the contaminated unlabeled data often dominates the training process. To solve these two problems, we propose a semi-supervised anomaly detection method named ConNet and a new loss function named concentration loss. ConNet consists of two stages. First, we obtain a prior anomaly score for each unlabeled instance via a prior estimation module and attach this score to the instance as its training weight. Then, an anomaly scoring network is trained to assign anomaly scores to data instances such that the scores of anomalies deviate significantly from those of normal instances. We conducted experiments on thirteen real-world data sets and evaluated our method in terms of detection accuracy, utilization efficiency of labeled data, and robustness to different contamination rates. The experimental results show that our method performs significantly better than state-of-the-art anomaly detection methods.
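The two-stage procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the prior estimation module or the form of the concentration loss, so both are replaced here by labeled stand-ins. The function names `prior_anomaly_scores` and `weighted_stand_in_loss`, the k-nearest-neighbor prior, the choice of weight `1 - prior`, and the `margin` parameter are all assumptions made for illustration.

```python
import numpy as np


def prior_anomaly_scores(X, k=5):
    """Stage 1 stand-in (assumption): mean distance to the k nearest
    neighbours, min-max scaled to [0, 1]. Higher means more anomalous."""
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # ignore each point's distance to itself
    knn = np.sort(dists, axis=1)[:, :k].mean(axis=1)
    span = knn.max() - knn.min()
    if span == 0:
        return np.zeros_like(knn)
    return (knn - knn.min()) / span


def weighted_stand_in_loss(scores_unlabeled, prior, scores_anomaly, margin=5.0):
    """Stage 2 stand-in (assumption, NOT the concentration loss):
    unlabeled instances are pulled toward score 0, each weighted by
    (1 - prior) so likely contaminants contribute less; labeled
    anomalies are pushed to score at least `margin`."""
    w = 1.0 - prior
    normal_term = (w * np.abs(scores_unlabeled)).sum() / max(w.sum(), 1e-12)
    anomaly_term = np.maximum(0.0, margin - scores_anomaly).mean()
    return normal_term + anomaly_term


# Toy usage: 50 inliers around the origin and one distant contaminant.
rng = np.random.default_rng(0)
X_unlabeled = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)), [[20.0, 20.0]]])
prior = prior_anomaly_scores(X_unlabeled)

# Scores that a hypothetical scoring network might output.
scores_unlabeled = np.zeros(len(X_unlabeled))
scores_anomaly = np.array([6.0])
loss = weighted_stand_in_loss(scores_unlabeled, prior, scores_anomaly)
```

In this sketch the distant point receives the largest prior score and therefore the smallest training weight, which is one simple way to keep contaminated unlabeled data from dominating the objective.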
