Large-scale classification of IPv6-IPv4 siblings with variable clock skew

Linking the growing IPv6 deployment to existing IPv4 addresses is a valuable research problem, be it for network forensics, structural analysis, or reconnaissance. In this work, we focus on classifying pairs of server IPv6 and IPv4 addresses as siblings, i.e., as running on the same machine. Our methodology leverages active measurements of TCP timestamps and other network characteristics, which we evaluate against a diverse ground truth of 682 hosts. We define and extract a set of features, including an estimate of variable (as opposed to constant) remote clock skew. On these features, we train a manually crafted algorithm as well as a machine-learned decision tree. By conducting several measurement runs and training in cross-validation rounds, we aim to create models that generalize well and do not overfit our training data. Both models exceed 99% precision on training and test data. We validate scalability by classifying 149k siblings in a large-scale measurement of 371k sibling candidates. We argue that this methodology, thoroughly cross-validated and likely to generalize well, can aid comparative studies of IPv6 and IPv4 behavior in the Internet. Striving for applicability and replicability, we release ready-to-use source code and raw data from our study.
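To illustrate the clock-skew idea behind the TCP timestamp features, the following is a minimal sketch and not the classifier from this study. It assumes the remote timestamp frequency (`hz`) is already known, estimates only a constant skew with a median-of-pairwise-slopes (Theil-Sen style) fit, and ignores 32-bit timestamp wraparound. The function names and the `tolerance_ppm` threshold are hypothetical and chosen for illustration only.

```python
from statistics import median


def clock_offsets(samples, hz):
    """Turn (local_receive_time_s, remote_tcp_timestamp) samples into
    (elapsed_s, offset_s) points relative to the first sample.
    `hz` is the assumed remote timestamp tick rate."""
    t0, ts0 = samples[0]
    return [(t - t0, (ts - ts0) / hz - (t - t0)) for t, ts in samples]


def skew_ppm(points):
    """Estimate a constant clock skew in parts per million as the
    median of all pairwise slopes of offset over elapsed time."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for i, (x1, y1) in enumerate(points)
        for (x2, y2) in points[i + 1:]
        if x2 != x1
    ]
    return median(slopes) * 1e6


def looks_like_sibling(v4_samples, v6_samples, hz=1000, tolerance_ppm=2.0):
    """Hypothetical decision rule: two addresses are sibling candidates
    if their estimated skews agree within `tolerance_ppm`."""
    skew4 = skew_ppm(clock_offsets(v4_samples, hz))
    skew6 = skew_ppm(clock_offsets(v6_samples, hz))
    return abs(skew4 - skew6) <= tolerance_ppm
```

A constant-skew comparison like this would not separate hosts whose clocks are disciplined to near-zero skew, and it does not model skew that changes over time, which is why the study combines variable skew estimation with further features.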
