Evaluating Real-Time Anomaly Detection Algorithms -- The Numenta Anomaly Benchmark

Much of the world's data is streaming time-series data, in which anomalies carry significant information in critical situations; examples abound in domains such as finance, IT, security, medicine, and energy. Yet detecting anomalies in streaming data is difficult: detectors must process data in real time rather than in batches, and must learn while simultaneously making predictions. There are no benchmarks that adequately test and score the efficacy of real-time anomaly detectors. Here we propose the Numenta Anomaly Benchmark (NAB), which attempts to provide a controlled and repeatable environment of open-source tools to test and measure anomaly detection algorithms on streaming data. The perfect detector would detect all anomalies as soon as possible, trigger no false alarms, work with real-world time-series data across a variety of domains, and automatically adapt to changing statistics. NAB formalizes the rewarding of these characteristics through a scoring algorithm designed for streaming data, and evaluates detectors on a benchmark dataset of labeled, real-world time-series data. We present these components and give results and analyses for several open-source, commercially used algorithms. The goal of NAB is to provide a standard, open-source framework with which the research community can compare and evaluate different algorithms for detecting anomalies in streaming data.
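
To make the scoring idea concrete, the following is a minimal sketch of how a score could reward early detections and penalize false alarms and missed anomalies. It is an illustration only, not the scoring function defined by NAB: the names `scaled_sigmoid` and `score_detections`, the sigmoid steepness of 5, the inclusive anomaly windows, and the flat `fp_weight` and `fn_weight` penalties are all assumptions introduced here.

```python
import math

def scaled_sigmoid(rel_pos):
    """Assumed weighting: near +1 at the window start (rel_pos = -1),
    exactly 0 at the window end (rel_pos = 0)."""
    return 2.0 / (1.0 + math.exp(5.0 * rel_pos)) - 1.0

def score_detections(detections, windows, fp_weight=1.0, fn_weight=1.0):
    """Score detection timestamps against labeled anomaly windows.

    detections: iterable of timestamps flagged by a detector.
    windows: list of (start, end) tuples, inclusive, non-overlapping.
    Both weights are hypothetical parameters for this sketch.
    """
    score = 0.0
    credited = set()  # indices of windows that already earned credit
    for t in sorted(detections):
        hit = None
        for i, (start, end) in enumerate(windows):
            if start <= t <= end:
                hit = i
                break
        if hit is None:
            # Detection outside every window: flat false-alarm penalty.
            score -= fp_weight
        elif hit not in credited:
            # Only the first detection in a window counts; earlier is better.
            start, end = windows[hit]
            width = max(end - start, 1)
            score += scaled_sigmoid((t - end) / width)
            credited.add(hit)
    # Every window with no detection is a missed anomaly.
    score -= fn_weight * (len(windows) - len(credited))
    return score
```

Under these assumptions, a detection near the start of a window scores higher than one near its end, repeated detections inside the same window earn no extra credit, and a detector that stays silent is penalized once per missed window.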
