High Precision Open-World Website Fingerprinting

Traffic analysis attacks to identify which web page a client is browsing, using only her packet metadata — known as website fingerprinting (WF) — has been proven effective in closed-world experiments against privacy technologies like Tor. We want to investigate their usefulness in the real open world. Several WF attacks claim to have high recall and low false positive rate, but they have only been shown to succeed against high base rate pages. We explicitly incorporate the base rate into precision and call it r-precision. Using this metric, we show that the best previous attacks have poor precision when the base rate is realistically low; we study such a scenario (r = 1000), where the maximum r-precision achieved was only 0.14.To improve r-precision, we propose three novel classes of precision optimizers that can be applied to any classifier to increase precision. For r = 1000, our best optimized classifier can achieve a precision of at least 0.86, representing a precision increase by more than 6 times. For the first time, we show a WF classifier that can scale to any open world set size. We also investigate the use of precise classifiers to tackle realistic objectives in website fingerprinting, including different types of websites, identification of sensitive clients, and defeating website fingerprinting defenses.

[1]  Andrew Hintz,et al.  Fingerprinting Websites Using Traffic Analysis , 2002, Privacy Enhancing Technologies.

[2]  Hannes Federrath,et al.  Website fingerprinting: attacking popular privacy enhancing technologies with the multinomial naïve-bayes classifier , 2009, CCSW '09.

[3]  Tadeusz Pietraszek,et al.  Using Adaptive Alert Classification to Reduce False Positives in Intrusion Detection , 2004, RAID.

[4]  Law. Policy Executive Summary of the National Academies of Science Reports, Strengthening Forensic Science in the United States: A Path Forward , 2009 .

[5]  Tao Wang,et al.  A Systematic Approach to Developing and Evaluating Website Fingerprinting Defenses , 2014, CCS.

[6]  Rachel Greenstadt,et al.  How Unique is Your .onion?: An Analysis of the Fingerprintability of Tor Onion Services , 2017, CCS.

[7]  David D. Jensen,et al.  Privacy Vulnerabilities in Encrypted HTTP Streams , 2005, Privacy Enhancing Technologies.

[8]  J. Koehler,et al.  The Coming Paradigm Shift in Forensic Identification Science , 2005, Science.

[9]  Tao Wang,et al.  Website Fingerprinting: Attacks and Defenses , 2016 .

[10]  Brian Neil Levine,et al.  Inferring the source of encrypted HTTP connections , 2006, CCS '06.

[11]  Mohsen Imani,et al.  Deep Fingerprinting: Undermining Website Fingerprinting Defenses with Deep Learning , 2018, CCS.

[12]  Mun Choon Chan,et al.  Website Fingerprinting and Identification Using Ordered Feature Sequences , 2010, ESORICS.

[13]  Thomas Engel,et al.  Website fingerprinting in onion routing based anonymization networks , 2011, WPES.

[14]  Rachel Greenstadt,et al.  A Critical Evaluation of Website Fingerprinting Attacks , 2014, CCS.

[15]  Tao Wang,et al.  Improved website fingerprinting on Tor , 2013, WPES.

[16]  George Danezis,et al.  k-fingerprinting: A Robust Scalable Website Fingerprinting Technique , 2015, USENIX Security Symposium.

[17]  Mike Perry,et al.  Toward an Efficient Website Fingerprinting Defense , 2015, ESORICS.

[18]  Tao Wang,et al.  On Realistically Attacking Tor with Website Fingerprinting , 2016, Proc. Priv. Enhancing Technol..

[19]  Brijesh Joshi,et al.  Touching from a distance: website fingerprinting attacks and defenses , 2012, CCS.

[20]  Claudia Díaz,et al.  Inside Job: Applying Traffic Analysis to Measure Tor from Within , 2018, NDSS.

[21]  Klaus Wehrle,et al.  Website Fingerprinting at Internet Scale , 2016, NDSS.

[22]  Xiang Cai,et al.  Glove: A Bespoke Website Fingerprinting Defense , 2014, WPES.

[23]  J. Swets ROC analysis applied to the evaluation of medical imaging techniques. , 1979, Investigative radiology.

[24]  Lili Qiu,et al.  Statistical identification of encrypted Web browsing traffic , 2002, Proceedings 2002 IEEE Symposium on Security and Privacy.

[25]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.