Problematic Machine Behavior

Although algorithm audits are rapidly becoming more common and more publicly important, relatively little scholarly work has synthesized prior studies or set out a strategy for future research in the area. This systematic literature review aims to do both, following PRISMA guidelines in a review of over 500 English-language articles that yielded 62 algorithm audit studies. The studies are synthesized and organized primarily by behavior (discrimination, distortion, exploitation, and misjudgement), with codes also provided for domain (e.g. search, vision, advertising), organization (e.g. Google, Facebook, Amazon), and audit method (e.g. sock puppet, direct scrape, crowdsourcing). The review shows how previous audit studies have exposed public-facing algorithms exhibiting problematic behavior, such as search algorithms responsible for distortion and advertising algorithms responsible for discrimination. Based on the studies reviewed, it also suggests some behaviors (e.g. discrimination on the basis of intersectional identities), domains (e.g. advertising algorithms), methods (e.g. code auditing), and organizations (e.g. Twitter, TikTok, LinkedIn) that call for future audit attention. The paper concludes by offering the common ingredients of successful audits and by discussing algorithm auditing in the context of broader research working toward algorithmic justice.
