Cyber-attack detection via non-linear prediction of IP addresses: an innovative big data analytics approach

Computer network systems are frequent targets of several types of attack. For example, a Distributed Denial of Service (DDoS) attack sends an excessive traffic load to a web server in order to make it unusable. A well-known detection method analyzes the sequence of source IP addresses to find anomalies. To predict the next IP address, the Probability Density Function of the IP address sequence is estimated, and anomalous requests are detected by predicting the source IP addresses of future accesses to the server. When an access occurs, the server accepts only requests from the predicted IP addresses and blocks all others. Existing approaches to estimating the Probability Density Function of IP addresses range from storing the previously seen IP addresses in a database to clustering addresses, for instance with the K-Means algorithm. In this paper, by contrast, the sequence of IP addresses is treated as a numerical sequence to which non-linear analysis is applied. In particular, we use non-linear analysis based on Volterra kernels and Hammerstein models. Experiments on datasets of source IP address sequences show that the prediction errors obtained with Hammerstein models are smaller than those obtained with either Volterra kernels or sequence clustering based on the K-Means algorithm.
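
As a rough illustration of treating source IP addresses as a numerical sequence and predicting the next value with a Hammerstein-type model, the following minimal Python sketch may help. It is not the paper's implementation. The names `ip_to_unit`, `hammerstein_features`, `fit_hammerstein` and `predict_next` are hypothetical, as are the scaling of addresses to [0, 1), the polynomial form of the static non-linearity, the memory length, and the least-squares fit. The model is written in an over-parameterized, linear-in-parameters form: powers of each past sample, each with its own weight.

```python
import ipaddress

import numpy as np


def ip_to_unit(ip: str) -> float:
    """Map a dotted IPv4 string to a float in [0, 1) (assumed scaling)."""
    return int(ipaddress.IPv4Address(ip)) / 2**32


def _feature_row(past: np.ndarray, degree: int) -> np.ndarray:
    """Bias term followed by past**1, ..., past**degree (most recent sample first)."""
    return np.concatenate([[1.0]] + [past**p for p in range(1, degree + 1)])


def hammerstein_features(x: np.ndarray, memory: int, degree: int) -> np.ndarray:
    """Regression matrix: one row per predictable sample x[n], n >= memory."""
    rows = [
        _feature_row(x[n - memory:n][::-1], degree)
        for n in range(memory, len(x))
    ]
    return np.array(rows)


def fit_hammerstein(x, memory: int = 3, degree: int = 2) -> np.ndarray:
    """Least-squares fit of the weights that predict x[n] from its past samples."""
    x = np.asarray(x, dtype=float)
    A = hammerstein_features(x, memory, degree)
    y = x[memory:]
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs


def predict_next(x, coeffs: np.ndarray, memory: int = 3, degree: int = 2) -> float:
    """Predict the sample that follows the end of x."""
    x = np.asarray(x, dtype=float)
    past = x[-memory:][::-1]
    return float(_feature_row(past, degree) @ coeffs)
```

A typical use would convert a log of source addresses with `ip_to_unit`, call `fit_hammerstein` on the resulting array, and compare `predict_next` with the address that actually arrives. How a prediction is turned into an accept-or-block decision (for example, a tolerance around the predicted value) is left open here, since the abstract does not specify it.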
