On the uniqueness of Web browsing history patterns

We present the results of the first large-scale study of the uniqueness of Web browsing histories, gathered from 368,284 Internet users who visited a history detection demonstration website. Our results show that the browsing history is unique for a majority of users (69%), and that users for whom we could detect at least four visited websites were uniquely identified by their histories in 97% of cases. We observe substantial stability in browser history fingerprints: for repeat visitors, 38% of fingerprints were identical over time, and those that differed remained correlated with the original history contents, indicating static browsing preferences (for history subvectors of size 50). Strikingly, testing for a small number of pages is enough both to enumerate users' interests and to build an efficient, unique behavioral fingerprint: testing 50 Web pages is enough to fingerprint 42% of users in our database, increasing to 70% with 500 Web pages.
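As an illustration of the uniqueness measure discussed above, the following minimal sketch computes the fraction of users whose history, restricted to a set of tested pages, is shared by no other user. It is not the authors' implementation: the function name `unique_fraction`, the representation of a history as a set of visited sites, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def unique_fraction(histories, tested_sites):
    """Return the fraction of users whose history vector over
    `tested_sites` is not shared with any other user.

    `histories` is a list of sets of visited sites (hypothetical format).
    """
    # Project each user's history onto the tested sites as a tuple of booleans.
    vectors = [
        tuple(site in visited for site in tested_sites)
        for visited in histories
    ]
    if not vectors:
        return 0.0
    # Count how many users share each projected vector.
    counts = Counter(vectors)
    unique_users = sum(1 for v in vectors if counts[v] == 1)
    return unique_users / len(vectors)


# Toy data: four users and three candidate sites (illustrative only).
histories = [
    {"a.example", "b.example"},
    {"a.example"},
    {"a.example"},
    {"c.example"},
]

print(unique_fraction(histories, ["a.example"]))
print(unique_fraction(histories, ["a.example", "b.example", "c.example"]))
```

In this toy setting, testing more sites can only split groups of identical vectors further, so the unique fraction never decreases as sites are added, which mirrors the qualitative trend reported between 50 and 500 tested pages.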
