Accurate DNS query characteristics estimation via active probing

As the hidden backbone of today's Internet, the Domain Name System (DNS) provides name resolution for almost every networked application. To exploit the rich information in DNS queries for traffic engineering or user behavior analysis, both passive capturing and active probing techniques have been proposed in recent years. Although passive capturing offers full visibility into DNS behavior, it incurs prohibitive management cost and raises serious privacy concerns when deployed collaboratively at large scale. Active probing overcomes these limitations and provides broad-view, privacy-preserving DNS query analysis, at the cost of limited visibility into fine-grained DNS behavior. This paper aims to accurately estimate DNS query characteristics from DNS cache activities, which can be acquired via active probing on a large scale with negligible management cost and minimal privacy concerns. Specifically, we make three contributions: (1) we propose a novel solution that integrates a renewal theory-based DNS caching formulation with the hyper-exponential distribution model, which offers great flexibility in modeling various domains; (2) we perform a large-scale measurement on real-world DNS traces and demonstrate that our solution significantly improves estimation accuracy; (3) we apply our solution to estimate the malware-infected host population in remote management networks. The experimental results demonstrate that our solution achieves high estimation accuracy and outperforms the existing method.
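To make the setting concrete, the following is a minimal sketch of what a prober can observe from a TTL-based cache and how a query rate might be read off it. It is not the estimator proposed in the paper. It assumes a single cache that refreshes a record on the first query after the TTL expires, draws query inter-arrival times from a hyper-exponential mixture, and then applies the simple single-exponential (Poisson) baseline, for which the mean gap between refreshes equals the TTL plus the mean inter-arrival time. All function names and parameter values are hypothetical.

```python
import random


def hyperexp_sample(rng, probs, rates):
    """Draw one inter-arrival time from a hyper-exponential mixture.

    With probability probs[i] the sample is exponential with rate rates[i].
    """
    u = rng.random()
    acc = 0.0
    for p, lam in zip(probs, rates):
        acc += p
        if u <= acc:
            return rng.expovariate(lam)
    # Guard against floating-point round-off in the cumulative sum.
    return rng.expovariate(rates[-1])


def simulate_refresh_gaps(rng, probs, rates, ttl, n_refreshes):
    """Simulate a TTL cache and return the gaps between consecutive refreshes.

    Assumption: a query at time 0 fills the cache; any query arriving at
    least `ttl` after the last refresh is a cache miss and refreshes it.
    """
    gaps = []
    t = 0.0
    last_refresh = 0.0
    while len(gaps) < n_refreshes:
        t += hyperexp_sample(rng, probs, rates)
        if t - last_refresh >= ttl:
            gaps.append(t - last_refresh)
            last_refresh = t
    return gaps


def poisson_rate_estimate(gaps, ttl):
    """Baseline estimator that is exact only for Poisson query arrivals.

    For Poisson arrivals with rate lam, E[gap] = ttl + 1/lam.
    """
    mean_gap = sum(gaps) / len(gaps)
    return 1.0 / (mean_gap - ttl)


if __name__ == "__main__":
    rng = random.Random(1)
    probs, rates, ttl = [0.9, 0.1], [10.0, 0.1], 5.0  # illustrative values
    true_rate = 1.0 / sum(p / lam for p, lam in zip(probs, rates))
    gaps = simulate_refresh_gaps(rng, probs, rates, ttl, 5000)
    print("true rate:", true_rate)
    print("Poisson-baseline estimate:", poisson_rate_estimate(gaps, ttl))
```

When the mixture has a single component, the arrivals are Poisson and the baseline should recover the rate. With several components of very different rates, the baseline is no longer exact, which is the kind of mismatch a more flexible inter-arrival model is meant to address.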
