An Automated Approach to Auditing Disclosure of Third-Party Data Collection in Website Privacy Policies

A dominant regulatory model for web privacy is "notice and choice". In this model, users are notified of data collection and provided with options to control it. To examine the efficacy of this approach, this study presents the first large-scale audit of disclosure of third-party data collection in website privacy policies. Data flows on one million websites are analyzed and over 200,000 websites' privacy policies are audited to determine if users are notified of the names of the companies that collect their data. Policies from 25 prominent third-party data collectors are also examined to provide deeper insights into the totality of the policy environment. Policies are additionally audited to determine if the choice expressed by the "Do Not Track" browser setting is respected. Third-party data collection is widespread, but fewer than 15% of attributed data flows are disclosed. The third parties most likely to be disclosed are those with consumer services users may be aware of; those without consumer services are less likely to be mentioned. Policies are difficult to understand, and the average time required to read both a given site's policy and the associated third-party policies exceeds 84 minutes. Only 7% of first-party site policies mention the Do Not Track signal, and the majority of such mentions specify that the signal is ignored. Among the third-party policies examined, none offers unqualified support for the Do Not Track signal. Findings indicate that current implementations of "notice and choice" fail to provide notice or respect choice.
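The disclosure check described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the function name `disclosed_share`, the whole-word matching rule, and the company names `ExampleAdTech` and `SampleTracker` are hypothetical and chosen only for illustration. Given the text of a site's policy and the names of the companies observed receiving data from that site, it reports the share of those companies the policy names.

```python
import re


def disclosed_share(policy_text, observed_owners):
    """Return (share, names) for the observed third parties named in the policy.

    Assumption: a company counts as disclosed if its name appears as a whole
    word (case-insensitive) anywhere in the policy text.
    """
    text = policy_text.lower()
    disclosed = [
        name
        for name in observed_owners
        if re.search(r"\b" + re.escape(name.lower()) + r"\b", text)
    ]
    # Guard against division by zero for sites with no observed third parties.
    share = len(disclosed) / len(observed_owners) if observed_owners else 0.0
    return share, disclosed


# Hypothetical policy text and observed third parties.
policy = "We use Google Analytics and share data with Facebook for advertising."
owners = ["Google", "Facebook", "ExampleAdTech", "SampleTracker"]

share, names = disclosed_share(policy, owners)
print(f"{share:.0%} disclosed: {names}")
```

Plain name matching of this kind would miss companies referred to by a product or subsidiary name, so a real audit would likely need a mapping from domains and brands to parent companies.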
