Identifying data streams anomalies by evolving spiking restricted Boltzmann machines

Data streams are highly volatile and can change drastically and unpredictably over time. Typically, the newest data are the most important, since the aging of an observation is determined by its timing. Such streams require real-time processing to extract meaningful information that enables essential, targeted responses to changing circumstances. Knowledge mining is therefore performed in real time on a subset of the data stream that contains a small but recent part of the observations. Current security requirements call for a continued search for optimal approaches capable of improving the reliability and accuracy of the employed classifiers. This research introduces a real-time evolving spiking restricted Boltzmann machine approach for efficient anomaly detection in data streams. Testing has shown that the proposed algorithm achieves high classification accuracy while keeping computational resource requirements low. A comparative analysis has shown that it outperforms other data flow analysis algorithms.
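The idea of mining only a small, recent part of the observations can be illustrated with a fixed-size sliding window. The following is a minimal sketch under stated assumptions, not the method proposed in this work: the class name `SlidingWindow`, its methods, and the window size are all hypothetical and chosen only for illustration.

```python
from collections import deque


class SlidingWindow:
    """Hypothetical helper: keeps only the most recent `size` observations."""

    def __init__(self, size):
        if size <= 0:
            raise ValueError("size must be positive")
        # A deque with maxlen silently discards the oldest item when full.
        self._items = deque(maxlen=size)

    def push(self, observation):
        """Add a new observation; the oldest one is dropped if the window is full."""
        self._items.append(observation)

    def snapshot(self):
        """Return the current window contents, oldest first."""
        return list(self._items)


# Illustrative stream of five values with an assumed window size of 3.
window = SlidingWindow(size=3)
for value in [1.0, 2.0, 3.0, 4.0, 5.0]:
    window.push(value)

recent = window.snapshot()
```

After the loop, `recent` holds only the three newest values, so any model trained or evaluated on it sees recent data and ignores older observations.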
