Revisiting IoT Fingerprinting behind a NAT

The growing use of Network Address Translation (NAT) over the past couple of years has become a double-edged sword. On one hand, it provides an added measure of security for legitimate users. On the other hand, the anonymity provided by NAT can be leveraged by malicious actors. To this end, fingerprinting devices behind a NAT aims to characterize the nature of such devices and thereby aid network and security provisioning. While the problem is certainly not new, it has been evolving rapidly given the wide macroscopic and microscopic deployments of IoT devices, and has recently attracted significant attention from the research community. In this paper, we revisit the task of classifying IoT devices deployed within a localized IoT realm. In contrast to the state-of-the-art, we explore the capabilities of unsupervised and semi-supervised shallow and deep learning methodologies in capturing the nature of such NATed devices. Initial results using empirical data indicate that clustering methods fail to fingerprint both IoT and non-IoT nodes behind a NAT. Further, although IoT devices typically exhibit relatively consistent traffic patterns, the results shed light on the unexpected capability of autoencoders to better capture non-IoT devices (in contrast to IoT nodes) behind a NAT. In an effort to comprehend such results, we implement and evaluate an explainability mechanism that provides preliminary insights into this phenomenon. Additionally, the developed semi-supervised Restricted Boltzmann Machine (RBM) approach generated results comparable to the state-of-the-art, without relying on the stringent and ad hoc process of feature engineering, and with relatively good algorithmic complexity and scalability.
Results from this work open up interesting directions for future research on network traffic analysis of NATed IoT devices, while highlighting the need to address explainability and concept drift.