Transfer Learning for Anomaly Detection through Localized and Unsupervised Instance Selection

Anomaly detection attempts to identify instances that deviate from expected behavior. Constructing performant anomaly detectors on real-world problems often requires some labeled data, which can be difficult and costly to obtain. However, often one considers multiple, related anomaly detection tasks. Therefore, it may be possible to transfer labeled instances from a related anomaly detection task to the problem at hand. This paper proposes a novel transfer learning algorithm for anomaly detection that selects and transfers relevant labeled instances from a source anomaly detection task to a target one. Then, it classifies target instances using a novel semi-supervised nearest-neighbors technique that considers both unlabeled target and transferred, labeled source instances. The algorithm outperforms a multitude of state-of-the-art transfer learning methods and unsupervised anomaly detection methods on a large benchmark. Furthermore, it outperforms its rivals on a real-world task of detecting anomalous water usage in retail stores.

[1]  Michelangelo Ceci,et al.  Exploiting transfer learning for the reconstruction of the human gene regulatory network , 2019, Bioinform..

[2]  Wouter M. Kouw An introduction to domain adaptation and transfer learning , 2018, ArXiv.

[3]  Vincent Vercruyssen,et al.  Semi-Supervised Anomaly Detection with an Application to Water Analytics , 2018, 2018 IEEE International Conference on Data Mining (ICDM).

[4]  Wannes Meert,et al.  Query Log Analysis: Detecting Anomalies in DNS Traffic at a TLD Resolver , 2018, DMLE/IOTSTREAMING@PKDD/ECML.

[5]  Maurizio Filippone,et al.  A comparative evaluation of outlier detection algorithms: Experiments and analyses , 2018, Pattern Recognit..

[6]  Vincent Vercruyssen,et al.  Transfer Learning for Time Series Anomaly Detection , 2017, IAL@PKDD/ECML.

[7]  Jing Zhang,et al.  Joint Geometrical and Statistical Alignment for Visual Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Lewis D. Griffin,et al.  Transfer representation-learning for anomaly detection , 2016, ICML 2016.

[9]  Seiichi Uchida,et al.  A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data , 2016, PloS one.

[10]  Arthur Zimek,et al.  On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study , 2016, Data Mining and Knowledge Discovery.

[11]  Kate Saenko,et al.  Return of Frustratingly Easy Domain Adaptation , 2015, AAAI.

[12]  Jesse Davis,et al.  TODTLER: Two-Order-Deep Transfer Learning , 2015, AAAI.

[13]  Philip S. Yu,et al.  A robust one-class transfer learning method with uncertain data , 2014, Knowledge and Information Systems.

[14]  Philip S. Yu,et al.  Transfer Joint Matching for Unsupervised Domain Adaptation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Philip S. Yu,et al.  Transfer Feature Learning with Joint Distribution Adaptation , 2013, 2013 IEEE International Conference on Computer Vision.

[16]  Thomas G. Dietterich,et al.  Systematic construction of anomaly detection benchmarks from real data , 2013, ODD '13.

[17]  Jieping Ye,et al.  Multisource domain adaptation and its application to early detection of fatigue , 2012, TKDD.

[18]  Yuan Shi,et al.  Geodesic flow kernel for unsupervised domain adaptation , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Andreas Dengel,et al.  Histogram-based Outlier Score (HBOS): A fast Unsupervised Anomaly Detection Algorithm , 2012 .

[20]  Ivor W. Tsang,et al.  Domain Adaptation via Transfer Component Analysis , 2009, IEEE Transactions on Neural Networks.

[21]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[22]  VARUN CHANDOLA,et al.  Anomaly detection: A survey , 2009, CSUR.

[23]  Zhi-Hua Zhou,et al.  Isolation Forest , 2008, 2008 Eighth IEEE International Conference on Data Mining.

[24]  Christos Faloutsos,et al.  LOCI: fast outlier detection using the local correlation integral , 2003, Proceedings 19th International Conference on Data Engineering (Cat. No.03CH37405).

[25]  Sridhar Ramaswamy,et al.  Efficient algorithms for mining outliers from large data sets , 2000, SIGMOD '00.

[26]  Hans-Peter Kriegel,et al.  LOF: identifying density-based local outliers , 2000, SIGMOD '00.

[27]  Salvatore J. Stolfo,et al.  Distributed data mining in credit card fraud detection , 1999, IEEE Intell. Syst..