Polls, clickbait, and commemorative $2 bills: problematic political advertising on news and media websites around the 2020 U.S. elections

Online advertising can be used to mislead, deceive, and manipulate Internet users, and political advertising is no exception. In this paper, we present a measurement study of online advertising around the 2020 United States elections, with a focus on identifying dark patterns and other potentially problematic content in political advertising. We scraped ad content from 745 news and media websites, accessed from six geographic locations in the U.S., between September 2020 and January 2021, collecting 1.4 million ads. We perform a systematic qualitative analysis of political content in these ads, as well as a quantitative analysis of the distribution of political ads on different types of websites. Our findings reveal the widespread use of problematic tactics in political ads, such as bait-and-switch ads formatted as opinion polls to entice users to click and the use of political controversy by content farms for clickbait. We also find that political ads appear more frequently on highly partisan news websites. We make policy recommendations for online political advertising, including greater scrutiny of non-official political ads and comprehensive standards across advertising platforms.
