Network Data Analysis Using Spark

With the huge increase in the volume of network traffic, there is a need for network monitoring systems that capture network packets and provide packet features in near real time to protect from attacks. As a first step towards developing such a system using distributed computation, new system has been developed in Spark, a cluster computing system, which extracts packet features with less memory consumption and at a faster rate. Traffic analysis and extraction of packet features are carried out using streaming capability inherent in Spark. Analysing the network data features provide a means for detecting attacks. This paper describes a system for the analysis of network data using Spark streaming technology which focuses on real time stream processing, built on top of Spark.