No boundaries: data exfiltration by third parties embedded on web pages

Abstract: We investigate data exfiltration by third-party scripts directly embedded on web pages. Specifically, we study three attacks: misuse of browsers’ internal login managers, social data exfiltration, and whole-DOM exfiltration. Although the possibility of these attacks was well known, we provide the first empirical evidence based on measurements of 300,000 distinct web pages from 50,000 sites. We extend OpenWPM’s instrumentation to detect and precisely attribute these attacks to specific third-party scripts. Our analysis reveals invasive practices such as inserting invisible login forms to trigger autofilling of the user’s saved credentials, and reading and exfiltrating social network data when the user logs in via Facebook Login. Further, we uncover password, credit card, and health data leaks to third parties due to wholesale collection of the DOM. We discuss the lessons learned from the responses to the initial disclosure of our findings and the fixes deployed by websites, browser vendors, third-party libraries, and privacy protection tools.
