A fast and accurate threat detection and prevention architecture using stream processing

Late detection of security breaches increases the risk of irreparable damage and limits the options for mitigation. We propose a fast and accurate threat detection and prevention architecture that combines the advantages of real-time stream processing with batch processing over a historical database. We create a dataset by capturing both legitimate and malicious traffic and propose two ways of combining packets into flows: one groups packets within a time window, and the other analyzes only the first few packets of each flow in each period. We also investigate the effectiveness of our proposal on real-world network traces obtained from a major Brazilian network operator that provides broadband Internet access to its customers. We implement and evaluate three classification algorithms and two anomaly detection methods. The results show an accuracy higher than 95% and an excellent trade-off between attack detection rate and false-positive rate. We further propose an improved scheme based on software-defined networking that automatically prevents threats by analyzing only the first few packets of a flow. The proposal blocks threats promptly and efficiently, is robust, and scales up, even when the attacker uses spoofed IP addresses.
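The two ways of combining packets into flows mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Packet` record, the 5-tuple flow key, the 2-second window, the limit of five packets, and the summary features are all assumptions chosen for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; the field names are assumptions."""
    ts: float    # capture timestamp in seconds
    src: str     # source IP address
    dst: str     # destination IP address
    sport: int   # source port
    dport: int   # destination port
    proto: str   # transport protocol
    size: int    # packet length in bytes


def flow_key(p):
    """Identify a flow by its 5-tuple (an assumed flow definition)."""
    return (p.src, p.dst, p.sport, p.dport, p.proto)


def flows_by_time_window(packets, window=2.0):
    """Approach 1: group every packet of a flow that falls in the same time window."""
    flows = defaultdict(list)
    for p in sorted(packets, key=lambda q: q.ts):
        flows[(int(p.ts // window), flow_key(p))].append(p)
    return dict(flows)


def flows_by_first_packets(packets, n=5, window=2.0):
    """Approach 2: keep only the first n packets of each flow in each period."""
    flows = defaultdict(list)
    for p in sorted(packets, key=lambda q: q.ts):
        bucket = flows[(int(p.ts // window), flow_key(p))]
        if len(bucket) < n:
            bucket.append(p)
    return dict(flows)


def summarize(flow):
    """Reduce a list of packets to a few illustrative per-flow features."""
    times = [p.ts for p in flow]
    return {
        "packets": len(flow),
        "bytes": sum(p.size for p in flow),
        "duration": max(times) - min(times),
    }
```

Under these assumptions, the second approach bounds the work done per flow and per period to `n` packets, which is what would allow a decision to be made before the rest of the flow arrives.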
