Online Anomaly Detection of Time Series at Scale

Cyber breaches can disrupt business operations, damage reputation, and directly affect the financial stability of the targeted corporations, with potential impacts on future profits and stock values. Automatic network-stream monitoring is therefore necessary for cyber situation awareness, and time-series anomaly detection plays an important role in such monitoring. This study surveys recent research on time-series analysis methods, covering both parametric and non-parametric techniques, and popular machine learning platforms for analysing streaming data in both single-server and cloud computing environments. We believe it provides a useful reference for researchers in academia and industry who need to select suitable time-series data analysis techniques and computing platforms according to data scale and real-time requirements.
