The Onions Have Eyes: A Comprehensive Structure and Privacy Analysis of Tor Hidden Services

Tor is a well known and widely used darknet, known for its anonymity. However, while its protocol and relay security have already been extensively studied, to date there is no comprehensive analysis of the structure and privacy of its Web Hidden Service. To fill this gap, we developed a dedicated analysis platform and used it to crawl and analyze over 1.5M URLs hosted in 7257 onion domains. For each page we analyzed its links, resources, and redirections graphs, as well as the language and category distribution. According to our experiments, Tor hidden services are organized in a sparse but highly connected graph, in which around 10% of the onions sites are completely isolated. Our study also measures for the first time the tight connection that exists between Tor hidden services and the Surface Web. In fact, more than 20% of the onion domains we visited imported resources from the Surface Web, and links to the Surface Web are even more prevalent than to other onion domains. Finally, we measured for the first time the prevalence and the nature of web tracking in Tor hidden services, showing that, albeit not as widespread as in the Surface Web, tracking is notably present also in the Dark Web: more than 40% of the scripts are used for this purpose, with the 70% of them being completely new tracking scripts unknown by existing anti-tracking solutions.

[1]  Sebastiano Vigna,et al.  Graph structure in the web --- revisited: a trick of the heavy tail , 2014, WWW.

[2]  Wouter Joosen,et al.  Cookieless Monster: Exploring the Ecosystem of Web-Based Device Fingerprinting , 2013, 2013 IEEE Symposium on Security and Privacy.

[3]  Jayant Madhavan,et al.  Google's Deep Web crawl , 2008, Proc. VLDB Endow..

[4]  Prateek Mittal,et al.  Stealthy traffic analysis of low-latency anonymous communication using throughput fingerprinting , 2011, CCS '11.

[5]  Guevara Noubir,et al.  OnionBots: Subverting Privacy Infrastructure for Cyber Attacks , 2015, 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks.

[6]  Nicolas Christin,et al.  Measuring the Longitudinal Evolution of the Online Anonymous Marketplace Ecosystem , 2015, USENIX Security Symposium.

[7]  Michael K. Bergman White Paper: The Deep Web: Surfacing Hidden Value , 2001 .

[8]  Stefan Lindskog,et al.  Spoiled Onions: Exposing Malicious Tor Exit Relays , 2014, Privacy Enhancing Technologies.

[9]  Prateek Mittal,et al.  RAPTOR: Routing Attacks on Privacy in Tor , 2015, USENIX Security Symposium.

[10]  Andrei Z. Broder,et al.  Graph structure in the Web , 2000, Comput. Networks.

[11]  Mitesh Patel,et al.  Accessing the deep web , 2007, CACM.

[12]  Angelos D. Keromytis,et al.  Detecting Traffic Snooping in Tor Using Decoys , 2011, RAID.

[13]  Tadayoshi Kohno,et al.  Internet Jones and the Raiders of the Lost Trackers: An Archaeological Study of Web Tracking from 1996 to 2016 , 2016, USENIX Security Symposium.

[14]  Delbert Dueck,et al.  Clustering by Passing Messages Between Data Points , 2007, Science.

[15]  George Danezis,et al.  Low-cost traffic analysis of Tor , 2005, 2005 IEEE Symposium on Security and Privacy (S&P'05).

[16]  Walter Rudametkin,et al.  Beauty and the Beast: Diverting Modern Web Browsers to Build Unique Browser Fingerprints , 2016, 2016 IEEE Symposium on Security and Privacy (SP).

[17]  Frank Piessens,et al.  FPDetective: dusting the web for fingerprinters , 2013, CCS.

[18]  Guevara Noubir,et al.  Honey Onions: A framework for characterizing and identifying misbehaving Tor HSDirs , 2016, 2016 IEEE Conference on Communications and Network Security (CNS).

[19]  Marc Dacier,et al.  Circuit Fingerprinting Attacks: Passive Deanonymization of Tor Hidden Services , 2015, USENIX Security Symposium.

[20]  Paul F. Syverson,et al.  Anonymous connections and onion routing , 1998, IEEE J. Sel. Areas Commun..

[21]  Arvind Narayanan,et al.  The Web Never Forgets: Persistent Tracking Mechanisms in the Wild , 2014, CCS.

[22]  Gerard Salton,et al.  A vector space model for automatic indexing , 1975, CACM.

[23]  Alex Biryukov,et al.  Trawling for Tor Hidden Services: Detection, Measurement, Deanonymization , 2013, 2013 IEEE Symposium on Security and Privacy.

[24]  Wei Liu,et al.  ViDE: A Vision-Based Approach for Deep Web Data Extraction , 2010, IEEE Transactions on Knowledge and Data Engineering.

[25]  Björn Scheuermann,et al.  The Sniper Attack: Anonymously Deanonymizing and Disabling the Tor Network , 2014, NDSS.

[26]  Dirk Grunwald,et al.  Shining Light in Dark Places: Understanding the Tor Network , 2008, Privacy Enhancing Technologies.