Threaded ensembles of autoencoders for stream learning

Anomaly detection in streaming data is an important problem in numerous application domains. Most existing model-based approaches to stream learning rely on decision trees because of their fast construction speed. This paper introduces the streaming autoencoder (SA), a fast, novel anomaly detection algorithm based on ensembles of neural networks for evolving data streams. It is a one-class learner that requires only data from the positive class for training and remains accurate even when anomalous training data are rare. It features an ensemble of threaded autoencoders with continuous learning capacity. Furthermore, SA uses a two-step detection mechanism to ensure that real anomalies are detected with a low false-positive rate. The method is highly efficient because it processes data streams in parallel with multiple threads and alternating buffers. Our analysis shows that SA has linear runtime and requires constant memory space. Empirical comparisons with state-of-the-art methods on multiple benchmark data sets demonstrate that the proposed method detects anomalies efficiently with fewer false alarms.
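To make the core mechanism concrete, the sketch below shows one way an online autoencoder can score and confirm anomalies in a stream. It is an illustrative reconstruction, not the authors' implementation: the network size, learning rate, the `factor`-times-recent-mean threshold, the 10-point confirmation delay, and the names `OnlineAutoencoder` and `detect_stream` are all assumptions. The two-step idea is that a point with an unusually large reconstruction error is first held as a candidate and only reported if it still looks extreme after the model has been updated on newer data.

```python
# Minimal sketch (assumed design, not the paper's code) of an online autoencoder
# whose squared reconstruction error is the anomaly score, plus a two-step check.
import numpy as np
from collections import deque


class OnlineAutoencoder:
    """Single-hidden-layer autoencoder updated one sample at a time with SGD."""

    def __init__(self, n_inputs, n_hidden=8, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.1, (n_inputs, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_inputs))
        self.b2 = np.zeros(n_inputs)
        self.lr = lr

    def _forward(self, x):
        h = np.tanh(x @ self.W1 + self.b1)       # encoder
        x_hat = h @ self.W2 + self.b2            # linear decoder
        return h, x_hat

    def score(self, x):
        """Anomaly score = squared reconstruction error."""
        _, x_hat = self._forward(x)
        return float(np.sum((x - x_hat) ** 2))

    def partial_fit(self, x):
        """One stochastic-gradient step on a single sample."""
        h, x_hat = self._forward(x)
        err = x_hat - x                          # gradient of 0.5*||x_hat - x||^2
        dW2, db2 = np.outer(h, err), err
        dh = (err @ self.W2.T) * (1 - h ** 2)    # backprop through tanh
        dW1, db1 = np.outer(x, dh), dh
        self.W2 -= self.lr * dW2
        self.b2 -= self.lr * db2
        self.W1 -= self.lr * dW1
        self.b1 -= self.lr * db1


def detect_stream(stream, n_inputs, warmup=100, factor=3.0, window=256, delay=10):
    """Two-step detection: a point whose error exceeds `factor` times the recent
    mean error becomes a candidate (step 1); it is reported as an anomaly only if
    it is still extreme after `delay` more points have updated the model (step 2)."""
    model = OnlineAutoencoder(n_inputs)
    recent = deque(maxlen=window)                # errors of presumed-normal points
    pending = []                                 # candidates awaiting confirmation
    for i, x in enumerate(stream):
        s = model.score(x)
        threshold = factor * np.mean(recent) if recent else np.inf
        if i >= warmup and s > threshold:
            pending.append((i, x))               # step 1: flag, do not train on it
        else:
            model.partial_fit(x)                 # learn only from presumed-normal points
            recent.append(s)
        still_pending = []
        for idx, cx in pending:                  # step 2: re-check candidates
            if i - idx < delay:
                still_pending.append((idx, cx))  # wait for a few more updates
            elif model.score(cx) > factor * np.mean(recent):
                yield idx                        # confirmed anomaly
            # otherwise: a transient spike the model has absorbed; drop silently
        pending = still_pending


# Example usage on a synthetic stream (rows are feature vectors):
# data = np.random.default_rng(1).normal(size=(5000, 4))
# data[2500] += 8.0                              # inject one obvious outlier
# print(list(detect_stream(data, n_inputs=4)))   # -> indices near 2500
```

A candidate caused by a transient spike or by gradual concept drift tends to be absorbed as the autoencoder keeps learning, which is what keeps false positives down, whereas a genuine anomaly remains hard to reconstruct and is confirmed.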

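The abstract also attributes much of the speed to multithreading with alternating buffers. The fragment below sketches that double-buffering pattern under the same assumptions as above (`model.partial_fit` is the hypothetical per-sample update from the previous sketch, and the buffer size is arbitrary): the main thread fills one buffer with arriving points while a worker thread trains on the other, and the two swap roles when the active buffer is full.

```python
# Minimal sketch (assumed design) of alternating buffers: one buffer fills while
# a worker thread trains an ensemble member on the other, then they swap.
import threading
import numpy as np


def run_with_alternating_buffers(stream, model, buffer_size=128):
    fill_buf, train_buf = [], []                 # one fills while the other trains

    def train_on(batch):
        for x in batch:
            model.partial_fit(np.asarray(x))

    worker = None
    for x in stream:
        fill_buf.append(x)
        if len(fill_buf) == buffer_size:
            if worker is not None:
                worker.join()                    # previous batch must finish before reuse
            fill_buf, train_buf = train_buf, fill_buf   # swap roles
            worker = threading.Thread(target=train_on, args=(train_buf,))
            worker.start()                       # train on the full buffer in the background
            fill_buf.clear()                     # reuse the buffer the worker just finished
    if worker is not None:
        worker.join()                            # any final partial buffer is ignored here
```

In CPython the global interpreter lock limits true parallelism for pure-Python updates, so a real implementation would rely on a runtime or numerical kernels that release the lock; the fragment is only meant to illustrate the buffer-swapping control flow described in the abstract.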