HARPO: Learning to Subvert Online Behavioral Advertising

Online behavioral advertising, and the associated tracking paraphernalia, poses a real privacy threat. Unfortunately, existing privacy-enhancing tools are not always effective against online advertising and tracking. We propose HARPO, a principled learning-based approach to subvert online behavioral advertising through obfuscation. HARPO uses reinforcement learning to adaptively interleave real page visits with fake pages to distort a tracker's view of a user's browsing profile. We evaluate HARPO against real-world user profiling and ad targeting models used for online behavioral advertising. The results show that HARPO improves privacy by triggering more than 40% incorrect interest segments and 6× higher bid values. HARPO outperforms existing obfuscation tools by as much as 16× for the same overhead. HARPO is also able to achieve better stealthiness to adversarial detection than existing obfuscation tools. HARPO meaningfully advances the state-of-the-art in leveraging obfuscation to subvert online behavioral advertising.
