ADAM & RAL: Adaptive Memory Learning and Reinforcement Active Learning for Network Monitoring

Network-traffic data commonly arrives as fast data streams; online network-monitoring systems continuously analyze these streams, sequentially collecting measurements over time. Continuous and dynamic learning is an effective strategy in these fast and dynamic environments, where concept drifts constantly occur. In this paper, we propose different approaches for stream-based machine learning that analyze network-traffic streams on the fly using supervised learning techniques. We address two major challenges associated with stream-based machine learning and online network monitoring: (i) how to dynamically learn from and adapt to non-stationary data and patterns that change over time, and (ii) how to deal with the limited availability of ground truth or labeled data needed to continuously tune a supervised learning model. We introduce ADAM & RAL, two stream-based machine-learning approaches to tackle these challenges. ADAM implements multiple stream-based machine-learning models and relies on an adaptive memory strategy to dynamically adapt the size of the system’s learning memory to the most recent data distribution, triggering new learning steps when concept drifts are detected. RAL implements a stream-based active-learning strategy to reduce the amount of labeled data needed for stream-based learning, dynamically deciding on the most informative samples to integrate into the continuous learning scheme. Using a reinforcement learning loop, RAL improves prediction performance by additionally learning from the goodness of its previous sample-selection decisions.
We focus on a particularly challenging problem in network monitoring: continuously tuning detection models able to recognize network attacks over time. By continuously learning from and detecting concept drifts within real network measurements, we show that ADAM & RAL can continuously achieve high detection accuracy and limit the amount of training data needed to detect attacks over dynamic network data streams.
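The two ideas summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `AdaptiveMemory` and `ThresholdQuerier`, the error-rate comparison used as a drift test, the additive threshold update, and all parameter values are hypothetical choices made only for illustration. The first class shrinks its memory when recent prediction errors diverge from older ones, in the spirit of an adaptive memory; the second asks for a label when the model's certainty is low and adjusts its querying threshold from a reward on each past query.

```python
from collections import deque


class AdaptiveMemory:
    """Hypothetical sliding memory that forgets old samples when drift is suspected."""

    def __init__(self, max_size=200, min_size=20, drift_gap=0.3):
        self.items = deque(maxlen=max_size)  # entries: (sample, label, was_error)
        self.min_size = min_size             # smallest half-window worth comparing
        self.drift_gap = drift_gap           # assumed error-rate gap that signals drift

    def add(self, sample, label, was_error):
        """Store one labeled sample; return True if a drift was flagged."""
        self.items.append((sample, label, bool(was_error)))
        n = len(self.items)
        if n < 2 * self.min_size:
            return False
        half = n // 2
        errors = [e for _, _, e in self.items]
        old_rate = sum(errors[:half]) / half
        new_rate = sum(errors[half:]) / (n - half)
        if new_rate - old_rate > self.drift_gap:
            # Drop the older half so the memory reflects the recent distribution;
            # a caller would retrain its model on self.items at this point.
            for _ in range(half):
                self.items.popleft()
            return True
        return False


class ThresholdQuerier:
    """Hypothetical label-query rule with a reward-driven certainty threshold."""

    def __init__(self, threshold=0.9, step=0.01, reward=1.0, penalty=-1.0):
        self.threshold = threshold
        self.step = step
        self.reward = reward
        self.penalty = penalty

    def should_query(self, certainty):
        """Ask for the true label when the model's certainty is below the threshold."""
        return certainty < self.threshold

    def feedback(self, model_was_wrong):
        """Score a past query: useful if the model was wrong, wasted otherwise."""
        r = self.reward if model_was_wrong else self.penalty
        # Useful queries raise the threshold (query more); wasted ones lower it.
        self.threshold = min(1.0, max(0.0, self.threshold + self.step * r))
        return r
```

In a stream loop, a caller would predict each incoming sample, call `should_query` with the model's certainty, and, when a label is obtained, pass the outcome to both `feedback` and `AdaptiveMemory.add`, retraining whenever `add` returns `True`.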
