MAGMA network behavior classifier for malware traffic

Malware is a major threat to security and privacy of network users. A large variety of malware is typically spread over the Internet, hiding in benign traffic. New types of malware appear every day, challenging both the research community and security companies to improve malware identification techniques. In this paper we present MAGMA, MultilAyer Graphs for MAlware detection, a novel malware behavioral classifier. Our system is based on a Big Data methodology, driven by real-world data obtained from traffic traces collected in an operational network. The methodology we propose automatically extracts patterns related to a specific input event, i.e., a seed, from the enormous amount of events the network carries. By correlating such activities over (i) time, (ii) space, and (iii) network protocols, we build a Network Connectivity Graph that captures the overall network behavior of the seed. We next extract features from the Connectivity Graph and design a supervised classifier. We run MAGMA on a large dataset collected from a commercial Internet Provider where 20,000 Internet users generated more than 330 million events. Only 42,000 are flagged as malicious by a commercial IDS, which we consider as an oracle. Using this dataset, we experimentally evaluate MAGMA accuracy and robustness to parameter settings. Results indicate that MAGMA reaches 95% accuracy, with limited false positives. Furthermore, MAGMA proves able to identify suspicious network events that the IDS ignored.

[1]  Sandeep Yadav,et al.  Detecting Malicious Domains via Graph Inference , 2014, ESORICS.

[2]  Wenke Lee,et al.  Detecting Malicious Flux Service Networks through Passive Analysis of Recursive DNS Traces , 2009, 2009 Annual Computer Security Applications Conference.

[3]  Michalis Faloutsos,et al.  PhishDef: URL names say it all , 2010, 2011 Proceedings IEEE INFOCOM.

[4]  Michalis Faloutsos,et al.  Entelecheia: Detecting P2P botnets in their waiting stage , 2013, 2013 IFIP Networking Conference.

[5]  Phillip A. Porras Inside risksReflections on Conficker , 2009, CACM.

[6]  Mark Stamp,et al.  HTTP attack detection using n-gram analysis , 2014, Comput. Secur..

[7]  H. Mannila,et al.  Discovering all most specific sentences , 2003, TODS.

[8]  Lei Liu,et al.  Detecting malicious clients in ISP networks using HTTP connectivity graph and flow information , 2014, 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014).

[9]  Wenke Lee,et al.  ARROW: GenerAting SignatuRes to Detect DRive-By DOWnloads , 2011, WWW.

[10]  Nir Kshetri,et al.  The Economics of Click Fraud , 2010, IEEE Secur. Priv..

[11]  Norbert Pohlmann,et al.  CoCoSpot: Clustering and recognizing botnet command and control channels using traffic analysis , 2013, Comput. Networks.

[12]  Roberto Perdisci,et al.  From Throw-Away Traffic to Bots: Detecting the Rise of DGA-Based Malware , 2012, USENIX Security Symposium.

[13]  Andreas Terzis,et al.  A multifaceted approach to understanding the botnet phenomenon , 2006, IMC '06.

[14]  Anthony K. H. Tung,et al.  Carpenter: finding closed patterns in long biological datasets , 2003, KDD '03.

[15]  Vinod Yegneswaran,et al.  BotHunter: Detecting Malware Infection Through IDS-Driven Dialog Correlation , 2007, USENIX Security Symposium.

[16]  Christopher Krügel,et al.  Nazca: Detecting Malware Distribution in Large-Scale Networks , 2014, NDSS.

[17]  Wenke Lee,et al.  Connected Colors: Unveiling the Structure of Criminal Networks , 2013, RAID.

[18]  Guofei Gu,et al.  BotSniffer: Detecting Botnet Command and Control Channels in Network Traffic , 2008, NDSS.

[19]  Radu State,et al.  BotTrack: Tracking Botnets Using NetFlow and PageRank , 2011, Networking.

[20]  Elena Baralis,et al.  SeaRum: A Cloud-Based Service for Association Rule Mining , 2013, 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications.

[21]  Elena Baralis,et al.  Network Connectivity Graph for Malicious Traffic Dissection , 2015, 2015 24th International Conference on Computer Communication and Networks (ICCCN).

[22]  Roy T. Fielding,et al.  Hypertext Transfer Protocol - HTTP/1.0 , 1996, RFC.

[23]  Stefan Savage,et al.  Manufacturing compromise: the emergence of exploit-as-a-service , 2012, CCS.

[24]  Jin Cao,et al.  Identifying suspicious activities through DNS failure graph analysis , 2010, The 18th IEEE International Conference on Network Protocols.

[25]  Guofei Gu,et al.  BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection , 2008, USENIX Security Symposium.

[26]  Yong Guan,et al.  Detecting Click Fraud in Pay-Per-Click Streams of Online Advertising Networks , 2008, 2008 The 28th International Conference on Distributed Computing Systems.

[27]  Lawrence K. Saul,et al.  Beyond blacklists: learning to detect malicious web sites from suspicious URLs , 2009, KDD.

[28]  Fuhui Long,et al.  Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy , 2003, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Christopher Krügel,et al.  Detection and analysis of drive-by-download attacks and malicious JavaScript code , 2010, WWW '10.

[30]  Elena Baralis,et al.  Macroscopic view of malware in home networks , 2015, 2015 12th Annual IEEE Consumer Communications and Networking Conference (CCNC).