Harms from Increasingly Agentic Algorithmic Systems
Dmitrii Krasheninnikov | David Krueger | Tegan Maharaj | John Burden | Adrian Weller | Umang Bhatt | Micah Carroll | Shalaleh Rismani | Konstantinos Voudouris | A. Mayhew | Wanru Zhao | Alan Chan | Nitarshan Rajkumar | Rebecca Salganik | Katherine Collins | Yawen Duan | Alva Markelius | Chris Pang | L. Langosco | Zhonghao He | Michelle Lin | Maryam Molamohammadi