Exploring NAT Detection and Host Identification Using Machine Learning

Network Address Translation (NAT) devices are widely used by end users, organizations, and Internet Service Providers. NAT provides anonymity for users within an organization by replacing their internal IP addresses with a single external wide area network address. While this anonymity adds a measure of security for legitimate users, malicious users can also exploit it by hiding behind NAT devices. Identifying NAT devices and the hosts behind them is therefore essential for detecting malicious behavior in traffic and application usage. In this paper, we propose a machine-learning-based solution that detects hosts behind NAT devices using flow-level statistics (excluding IP addresses, port numbers, and application-layer information) from passive traffic measurements. We capture a large dataset and extensively evaluate our proposed approach against four existing approaches from the literature. Our results show that the proposed approach identifies NAT behaviors and hosts with higher accuracy than the existing approaches, and they also demonstrate the impact of parameter sensitivity on the proposed approach.
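To make the idea of classifying flows without IP addresses, port numbers, or application-layer information concrete, the following is a minimal sketch in Python. It is not the method evaluated in the paper: the field names (`duration_s`, `packets`, `bytes`, and so on), the synthetic flow records, and the nearest-centroid classifier are all assumptions chosen for illustration only.

```python
from statistics import mean

# Hypothetical field names; the abstract does not list the actual features.
EXCLUDED = {"src_ip", "dst_ip", "src_port", "dst_port", "payload"}


def flow_features(record):
    """Keep only numeric flow-level statistics, dropping identifying fields."""
    return {k: float(v) for k, v in sorted(record.items()) if k not in EXCLUDED}


def fit_centroids(rows, labels):
    """Compute one mean feature vector per class label."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = {k: mean(m[k] for m in members) for k in members[0]}
    return centroids


def predict(centroids, row):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((row[k] - centroid[k]) ** 2 for k in centroid)
    return min(centroids, key=lambda label: dist(centroids[label]))


# Synthetic, made-up flows: (record, label). Values are not from the paper.
train = [
    ({"src_ip": "10.0.0.1", "dst_ip": "192.0.2.1", "src_port": 5000,
      "dst_port": 443, "duration_s": 1.0, "packets": 10, "bytes": 1000}, "non_nat"),
    ({"src_ip": "10.0.0.2", "dst_ip": "192.0.2.1", "src_port": 5001,
      "dst_port": 443, "duration_s": 2.0, "packets": 12, "bytes": 1200}, "non_nat"),
    ({"src_ip": "10.0.0.3", "dst_ip": "192.0.2.1", "src_port": 5002,
      "dst_port": 443, "duration_s": 8.0, "packets": 90, "bytes": 9000}, "nat"),
    ({"src_ip": "10.0.0.4", "dst_ip": "192.0.2.1", "src_port": 5003,
      "dst_port": 443, "duration_s": 9.0, "packets": 110, "bytes": 11000}, "nat"),
]

rows = [flow_features(record) for record, _ in train]
labels = [label for _, label in train]
centroids = fit_centroids(rows, labels)

# An unseen synthetic flow; its addresses and ports never reach the classifier.
unseen = {"src_ip": "10.0.0.9", "dst_ip": "192.0.2.1", "src_port": 6000,
          "dst_port": 80, "duration_s": 7.0, "packets": 80, "bytes": 8500}
print(predict(centroids, flow_features(unseen)))  # nat
```

The sketch does no feature scaling, so the `bytes` field dominates the distance; a real pipeline would normalize features and would likely use a stronger learner than nearest centroid.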
