Reinforcement Learning as a Framework for Ethical Decision Making

Emerging AI systems will increasingly make decisions that significantly affect human lives. It is essential, then, that these systems make decisions that take into account the desires, goals, and preferences of other people, while simultaneously learning what those preferences are. In this work, we argue that the reinforcement-learning framework achieves the generality required to theorize about an idealized ethical artificial agent, and offers the proper foundations for grounding specific questions about ethical learning and decision making that can promote further scientific investigation. We define an idealized formalism for an ethical learner, and conduct experiments on two toy ethical dilemmas, demonstrating the soundness and flexibility of our approach. Finally, we identify several critical challenges for future advancement in the area that can leverage our proposed framework.
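To make the kind of formalism the abstract describes concrete, the sketch below states a standard partially observable Markov decision process (POMDP) in which the reward function, standing in for human ethical preferences, is hidden from the agent and must be learned through interaction. The notation (M, S, A, T, R, Omega, O, gamma) is the generic textbook formulation, offered as an illustration of the setup rather than the paper's own definitions.

  % A minimal sketch: an ethical learner as a POMDP whose reward function
  % is part of the hidden state, so the agent must learn what to value
  % while acting.
  M = \langle S, A, T, R, \Omega, O, \gamma \rangle
  % S: states; A: actions; T(s' \mid s, a): transition probabilities;
  % R(s, a): reward encoding human preferences, unobserved by the agent;
  % \Omega: observations; O(o \mid s', a): observation probabilities;
  % \gamma \in [0, 1): discount factor.
  %
  % The learner seeks a policy maximizing expected discounted return,
  \pi^{*} = \arg\max_{\pi} \; \mathbb{E}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t) \,\middle|\, \pi \right],
  % while maintaining a belief over candidate reward functions
  % (candidate ethical theories) and updating it from observed
  % human feedback.

Under this reading, "learning what those preferences are" amounts to belief updating over R, and "taking preferences into account" amounts to acting near-optimally with respect to the current belief.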
