Improve black-box sequential anomaly detector relevancy with limited user feedback

Anomaly detectors are often designed to catch statistical anomalies. End users, however, are typically interested not in every detected outlier but only in those relevant to their application. Given an existing black-box sequential anomaly detector, this paper proposes a method to improve its user relevancy using a small amount of human feedback. As our first contribution, the method is agnostic to the detector: it assumes access only to the detector's anomaly scores and requires no information about its internals. Motivated by the fact that anomalies are of different types, our approach identifies these types and uses user feedback to assign a relevancy score to each type. This relevancy score, as our second contribution, is used to adjust the subsequent anomaly selection process. Empirical results on synthetic and real-world datasets show that our approach yields significant improvements in precision and recall over a range of anomaly detectors.
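
To make the pipeline concrete, the following is a minimal sketch of the general idea only, not the paper's actual algorithm. It assumes that each candidate anomaly already carries a black-box score and a type label, that a handful of candidates have been labeled relevant or irrelevant by a user, and that relevancy per type can be estimated as a smoothed fraction of relevant labels. The function names `type_relevancy` and `rerank`, the `"spike"`/`"dip"` type labels, the smoothing parameters, and the multiplicative score adjustment are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def type_relevancy(feedback, prior=0.5, strength=1.0):
    """Estimate a relevancy score per anomaly type from user feedback.

    feedback: iterable of (type_label, is_relevant) pairs.
    The estimate is a smoothed fraction of relevant labels, so a type with
    very little feedback stays close to `prior` (an assumed default).
    """
    counts = defaultdict(lambda: [0, 0])  # type -> [relevant, total]
    for type_label, is_relevant in feedback:
        counts[type_label][0] += int(is_relevant)
        counts[type_label][1] += 1
    return {
        t: (relevant + prior * strength) / (total + strength)
        for t, (relevant, total) in counts.items()
    }


def rerank(candidates, relevancy, default=0.5):
    """Re-order candidates by detector score weighted by type relevancy.

    candidates: list of (index, detector_score, type_label).
    Types without any feedback fall back to `default` (an assumption).
    Only the detector's scores are used; its internals are never touched.
    """
    adjusted = [
        (idx, score * relevancy.get(type_label, default), type_label)
        for idx, score, type_label in candidates
    ]
    return sorted(adjusted, key=lambda c: c[1], reverse=True)


# Hypothetical usage: three candidates from some black-box detector.
feedback = [("spike", True), ("spike", True), ("dip", False)]
relevancy = type_relevancy(feedback)
candidates = [(2, 0.90, "spike"), (4, 0.95, "dip"), (6, 0.80, "spike")]
ranked = rerank(candidates, relevancy)
print([idx for idx, _, _ in ranked])
```

In this toy run the dip has the highest raw score but drops to the bottom of the ranking, because the only feedback on that type marked it irrelevant. How the types are actually discovered and how relevancy enters the selection step are left to the paper's method.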
