Dmap: Automating Domain Name Ecosystem Measurements and Applications

Behind each Internet domain name, there is a set of entities and companies responsible for delivering the various services associated with it, such as Web hosting and e-mail. Together, they form what we refer to as the DNS ecosystem. Currently, no single measurement tool is designed to measure this ecosystem as a whole. As a result, researchers who aim to analyze (parts of) this ecosystem often have to spend significant amounts of time preparing and executing multiple application measurements and post-processing their heterogeneous raw datasets. Given that time is a scarce resource, this complexity diverts researchers' time from actual analysis, ultimately limiting how far many studies go. To help researchers facing this situation, we present Dmap, an active measurement application that reduces the complexity of executing both measurements and analysis. It does so by (i) automating the crawling of several application protocols (DNS, HTTP, TLS/SSL, and SMTP, over both IPv4 and IPv6) and (ii) storing the results in a relational database, enabling researchers to quickly test hypotheses within interactive response times using SQL. The current version of Dmap has 40 classifiers that generate 166 derived features (e.g., CMS detection, page language), which researchers and operators can use to build applications and services. We present an evaluation of Dmap and show three applications for which it can be used, such as profiling the Alexa 1 million domains. We use Dmap at SIDN (the .nl registry) for research on the .nl zone and make it open-source for researchers.
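
To illustrate the kind of SQL-based hypothesis test described above, the following is a minimal sketch, not Dmap's actual schema or code. It assumes a hypothetical table `crawl_result` holding a few derived features per domain (the table name, column names, and sample rows are all invented for illustration) and uses an in-memory SQLite database as a stand-in for the relational store.

```python
import sqlite3

# Hypothetical schema: one row per crawled domain with a few derived features.
# Table and column names are illustrative assumptions, not Dmap's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE crawl_result (
        domain        TEXT PRIMARY KEY,
        cms           TEXT,     -- detected CMS, NULL if none was detected
        page_language TEXT,     -- detected page language
        has_tls       INTEGER,  -- 1 if a TLS handshake succeeded, else 0
        has_ipv6      INTEGER   -- 1 if the Web server answered over IPv6
    )
""")

# Made-up sample rows standing in for crawl output.
rows = [
    ("example.nl", "WordPress", "nl", 1, 1),
    ("example.org", "WordPress", "en", 1, 0),
    ("example.com", "Joomla", "en", 0, 0),
    ("example.net", None, "en", 1, 1),
]
conn.executemany("INSERT INTO crawl_result VALUES (?, ?, ?, ?, ?)", rows)


def tls_share_by_cms(conn):
    """Return (cms, number of domains, share of domains with TLS) per CMS."""
    query = """
        SELECT COALESCE(cms, 'unknown') AS cms_name,
               COUNT(*)                 AS domains,
               AVG(has_tls)             AS tls_share
        FROM crawl_result
        GROUP BY cms_name
        ORDER BY cms_name
    """
    return conn.execute(query).fetchall()


result = tls_share_by_cms(conn)
for cms_name, domains, tls_share in result:
    print(f"{cms_name}: {domains} domain(s), TLS share {tls_share:.2f}")
```

Because all features sit in one relational table, a question such as "does TLS adoption differ by CMS?" reduces to a single aggregate query, with no custom post-processing of raw crawl data.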
