Protecting Society from AI Misuse: When are Restrictions on Capabilities Warranted?

Artificial intelligence (AI) systems will increasingly be used to cause harm as they grow more capable. In fact, AI systems are already starting to be used to automate fraudulent activities, violate human rights, create harmful fake images, and identify dangerous toxins. To prevent some misuses of AI, we argue that targeted interventions on certain capabilities will be warranted. These restrictions may include controlling who can access certain types of AI models, what they can be used for, whether outputs are filtered or can be traced back to their user, and the resources needed to develop them. We also contend that some restrictions on non-AI capabilities needed to cause harm will be required. Though capability restrictions risk reducing legitimate use more than misuse (an unfavorable Misuse-Use Tradeoff), we argue that interventions on capabilities are warranted when other interventions are insufficient, the potential harm from misuse is high, and there are targeted ways to intervene on capabilities. We provide a taxonomy of interventions that can reduce AI misuse, focusing on the specific steps required for a misuse to cause harm (the Misuse Chain), and a framework to determine whether an intervention is warranted. We apply this reasoning to three examples: predicting novel toxins, creating harmful images, and automating spear phishing campaigns.
