Co-Clustering Host-Domain Graphs to Discover Malware Infection

Malware is at the root of most cyber-attacks, which cause billions of dollars in damage every year. Most malware, especially Advanced Persistent Threat (APT) malware, uses the Domain Name System (DNS) to control compromised machines and steal sensitive information. Several security products therefore identify malware infection by combining machine learning with DNS data. However, existing detection approaches cannot identify malicious domain names and infected hosts at the same time. To solve this problem, this work proposes a co-clustering-based detection approach that does not require labeled data and that integrates active DNS data with graph inference. First, a host-domain graph is generated from active DNS data. Then a subset of the domain nodes is labeled with the aid of a blacklist, a popular domain list, and Alexa ranking. Finally, semi-supervised co-clustering is used to discover potentially malicious domains and malware-infected hosts in the monitored network. Experiments were conducted in a network of hundreds of internal hosts that access 145 malware domains. The results show that the proposed approach identifies malware domains with a true positive rate of up to 97.2%. This work also compares and analyzes the results obtained with different cluster calculation formulas under two different bipartite edge weights. The results show that clustering with maximum and minimum edge weights is more tolerant of different distance calculation methods.
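The pipeline described above (build a host-domain graph, seed some domain labels from lists, then infer the rest) can be illustrated with a minimal Python sketch. It is not the paper's co-clustering algorithm: a simple score propagation over the bipartite graph stands in for it, edges are unweighted, and every name, host, domain, and list below is a hypothetical chosen for illustration.

```python
from collections import defaultdict


def build_graph(queries):
    """Build bipartite adjacency sets from (host, domain) DNS query pairs."""
    host_to_domains = defaultdict(set)
    domain_to_hosts = defaultdict(set)
    for host, domain in queries:
        host_to_domains[host].add(domain)
        domain_to_hosts[domain].add(host)
    return host_to_domains, domain_to_hosts


def seed_labels(domains, blacklist, whitelist):
    """Seed score 1.0 for blacklisted and 0.0 for whitelisted domains."""
    seeds = {}
    for domain in domains:
        if domain in blacklist:
            seeds[domain] = 1.0
        elif domain in whitelist:
            seeds[domain] = 0.0
    return seeds


def propagate(host_to_domains, domain_to_hosts, seeds, rounds=10):
    """Alternate averaging between hosts and domains (stand-in for co-clustering).

    Seeded domains keep their score; unlabeled domains start at 0.5.
    """
    domain_score = {d: seeds.get(d, 0.5) for d in domain_to_hosts}
    host_score = {}
    for _ in range(rounds):
        for host, domains in host_to_domains.items():
            host_score[host] = sum(domain_score[d] for d in domains) / len(domains)
        for domain, hosts in domain_to_hosts.items():
            if domain not in seeds:
                domain_score[domain] = sum(host_score[h] for h in hosts) / len(hosts)
    return host_score, domain_score


# Hypothetical DNS observations: (internal host, queried domain).
queries = [
    ("h1", "evil.example"), ("h1", "unknown.example"),
    ("h2", "evil.example"), ("h2", "unknown.example"),
    ("h3", "good.example"), ("h3", "news.example"),
    ("h4", "good.example"), ("h4", "news.example"),
]
blacklist = {"evil.example"}   # assumed known-malicious domain
whitelist = {"good.example"}   # assumed popular, benign domain

host_to_domains, domain_to_hosts = build_graph(queries)
seeds = seed_labels(domain_to_hosts, blacklist, whitelist)
host_score, domain_score = propagate(host_to_domains, domain_to_hosts, seeds)

# Scores near 1.0 suggest malicious domains or infected hosts in this toy model.
suspicious_domains = sorted(d for d, s in domain_score.items() if s > 0.9)
suspicious_hosts = sorted(h for h, s in host_score.items() if s > 0.9)
```

In this toy data, the unlabeled domain queried only by hosts that also contact the blacklisted domain drifts toward the malicious score, while the one queried by hosts of the whitelisted domain drifts toward benign. A real system would also need edge weights and a distance measure, which the abstract mentions but does not specify.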
