Multi-Objective Reinforcement Learning for Designing Ethical Environments

A key challenge for AI research is to ensure that autonomous agents learn to behave ethically, that is, in alignment with moral values. A common approach, based on Reinforcement Learning techniques, is to design environments that incentivise agents to behave ethically. However, to the best of our knowledge, current approaches do not offer theoretical guarantees that an agent will learn to behave ethically. Here, we make headway in this direction by proposing a novel way of designing environments in which it is formally guaranteed that an agent learns to behave ethically while pursuing its individual objective. Our theoretical results are developed within the formal framework of Multi-Objective Reinforcement Learning, which eases the handling of an agent's individual and ethical objectives. As a further contribution, we leverage these theoretical results to introduce an algorithm that automates the design of ethical environments.
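To make the setting concrete, the sketch below shows one standard way such a multi-objective problem can be set up: a tabular Q-learning agent receives a reward vector with an individual and an ethical component and learns against a linear scalarisation of the two. Everything in it (the toy 3x3 grid, the reward values, the ethical weight w_e, and all function names) is an illustrative assumption, not the paper's algorithm or its guarantees.

```python
"""Minimal sketch of multi-objective Q-learning with linear scalarisation.

Illustrative toy only: the agent observes a reward vector
(individual, ethical) and learns tabular Q-values for the scalarised
reward r = r_ind + w_e * r_eth. Grid layout, reward values, and w_e are
assumptions made for this example.
"""
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GOAL = (2, 2)        # individual objective: reach the goal
TEMPTATION = (1, 1)  # entering this cell pays individually but is "unethical"

def step(state, action):
    """Apply an action; return (next_state, (r_individual, r_ethical), done)."""
    nxt = (min(2, max(0, state[0] + action[0])),
           min(2, max(0, state[1] + action[1])))
    r_ind, r_eth = -0.1, 0.0          # small step cost, no ethical penalty
    if nxt == TEMPTATION:
        r_ind, r_eth = 0.5, -1.0      # individually tempting, ethically penalised
    if nxt == GOAL:
        return nxt, (r_ind + 1.0, r_eth), True
    return nxt, (r_ind, r_eth), False

def train(w_e, episodes=5000, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning on the scalarised reward r_ind + w_e * r_eth."""
    q = {((r, c), a): 0.0 for r in range(3) for c in range(3)
         for a in range(len(ACTIONS))}
    for _ in range(episodes):
        state, done = (0, 0), False
        while not done:
            if random.random() < eps:
                a = random.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda b: q[(state, b)])
            nxt, (r_ind, r_eth), done = step(state, ACTIONS[a])
            r = r_ind + w_e * r_eth   # linear scalarisation of the two objectives
            target = r if done else r + gamma * max(
                q[(nxt, b)] for b in range(len(ACTIONS)))
            q[(state, a)] += alpha * (target - q[(state, a)])
            state = nxt
    return q

def greedy_visits_temptation(q):
    """Roll out the greedy policy; report whether it enters the unethical cell."""
    state, done, steps = (0, 0), False, 0
    while not done and steps < 20:
        a = max(range(len(ACTIONS)), key=lambda b: q[(state, b)])
        state, _, done = step(state, ACTIONS[a])
        if state == TEMPTATION:
            return True
        steps += 1
    return False

if __name__ == "__main__":
    random.seed(0)
    for w_e in (0.0, 2.0):  # a purely selfish agent vs. a large ethical weight
        q = train(w_e)
        print(f"w_e={w_e}: greedy policy enters unethical cell? "
              f"{greedy_visits_temptation(q)}")
```

Under these assumptions, running the script with w_e = 0 yields a greedy policy that detours through the penalised cell for the individual bonus, while a sufficiently large ethical weight (here w_e = 2) makes the greedy policy avoid it. Choosing scalarisation weights so that ethical behaviour provably dominates the agent's incentives is the kind of environment design the abstract alludes to.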
