Detecting Port and Net Scan using Apache Spark

Today, because of the high number of attacks and anomalous events in network traffic, network anomaly detection has become an important research area. It is necessary to detect any behavior that does not comply with a well-defined notion of normal behavior in order to avoid further harm. The two most widespread network anomalies related to network security are port scans and net scans, activities performed by a malicious host to find and examine potential victims. In this work, a novel approach for detecting port and net scans using a Big Data Analytics framework is presented. The approach works at flow level and has been conceived to detect such anomalous events on high-speed networks in a short time. In accordance with this approach, an algorithm has been created that detects IP addresses generating port and net scanning activities and is suited for execution on the Apache Spark framework. The paper describes the approach and the proposed algorithm and then presents an experimental analysis of its performance, including a comparison with the MAWILab archive. The execution time of the algorithm has also been experimentally evaluated by running Apache Spark on a private Cloud, and proved to be very short even on large traffic traces. Moreover, the results of the comparison show that the algorithm achieves high Precision and Recall for port and net scan detection. Our approach also detects anomalies that the gold standard does not.
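The abstract does not spell out the detection rule, so the following is only an illustrative sketch of one common flow-level heuristic, not the paper's algorithm. It assumes flows reduced to hypothetical `(src_ip, dst_ip, dst_port)` tuples and hypothetical thresholds `min_ports` and `min_hosts`: a source is flagged as a port scanner when it contacts many distinct ports on one host, and as a net scanner when it contacts many distinct hosts on one port. A helper computes Precision and Recall against a labelled set. The code is plain Python so that it runs without a cluster; on Spark, the same grouping could be expressed with DataFrame `groupBy` and `countDistinct`.

```python
from collections import defaultdict


def detect_scanners(flows, min_ports=3, min_hosts=3):
    """Flag scanning sources in an iterable of (src_ip, dst_ip, dst_port) flows.

    min_ports and min_hosts are assumed thresholds, not values from the paper.
    Returns (port_scanners, net_scanners) as two sets of source IPs.
    """
    ports_per_pair = defaultdict(set)  # (src, dst) -> distinct destination ports
    hosts_per_port = defaultdict(set)  # (src, port) -> distinct destination hosts
    for src, dst, port in flows:
        ports_per_pair[(src, dst)].add(port)
        hosts_per_port[(src, port)].add(dst)

    # Port scan: one source probes many ports on the same destination host.
    port_scanners = {
        src for (src, _dst), ports in ports_per_pair.items()
        if len(ports) >= min_ports
    }
    # Net scan: one source probes the same port on many destination hosts.
    net_scanners = {
        src for (src, _port), hosts in hosts_per_port.items()
        if len(hosts) >= min_hosts
    }
    return port_scanners, net_scanners


def precision_recall(detected, labelled):
    """Precision and Recall of a detected set against a labelled reference set."""
    true_positives = len(detected & labelled)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(labelled) if labelled else 0.0
    return precision, recall
```

Keying the sets on `(src, dst)` and `(src, port)` keeps both checks to a single pass over the flows, which is the same shape as a grouped distinct count in a distributed framework.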