Harms from Increasingly Agentic Algorithmic Systems

Research in Fairness, Accountability, Transparency, and Ethics (FATE) has established many sources and forms of algorithmic harm, in domains as diverse as health care, finance, policing, and recommendations. Much work remains to mitigate the serious harms of these systems, particularly those disproportionately affecting marginalized communities. Despite these ongoing harms, new systems are being developed and deployed that threaten to perpetuate the same harms and to create novel ones. In response, the FATE community has emphasized the importance of anticipating harms. Our work focuses on the anticipation of harms from increasingly agentic systems. Rather than providing a definition of agency as a binary property, we identify four key characteristics which, particularly in combination, tend to increase the agency of a given algorithmic system: underspecification, directness of impact, goal-directedness, and long-term planning. We also discuss important harms that arise from increasing agency -- notably, these include systemic and/or long-range impacts, often on marginalized stakeholders. We emphasize that recognizing the agency of algorithmic systems does not absolve humans of responsibility for algorithmic harms, nor shift that responsibility elsewhere. Rather, we use the term agency to highlight the increasingly evident fact that ML systems are not fully under human control. Our work explores increasingly agentic algorithmic systems in three parts. First, we explain the notion of an increase in agency for algorithmic systems in the context of diverse perspectives on agency across disciplines. Second, we argue for the need to anticipate harms from increasingly agentic systems. Third, we discuss important harms from increasingly agentic systems and ways forward for addressing them. We conclude by reflecting on the implications of our work for anticipating algorithmic harms from emerging systems.
