Benefits of Assistance over Reward Learning

Much recent work has focused on how an agent can learn what to do from human feedback, leading to two major paradigms. The first paradigm is reward learning, in which the agent learns a reward model from human feedback provided externally to the environment, and then optimizes the learned reward. The second is assistance, in which the human is modeled as part of the environment and the true reward function is treated as a latent variable of the environment that the agent may make inferences about. The key difference between the two paradigms is that reward learning, by construction, separates the learning of the reward from control using the learned reward, whereas in assistance both functions are performed as needed by a single policy. By merging reward learning and control, assistive agents can reason about the impact of control actions on reward learning, leading to several advantages over agents based on reward learning. We illustrate these advantages in simple environments by showing desirable qualitative behaviors of assistive agents that cannot be exhibited by agents based on reward learning.
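
To make the structural difference concrete, below is a minimal toy sketch, not taken from the paper: it assumes a one-shot setting with a binary latent preference, a hypothetical QUERY_COST, and illustrative function names (reward_learning_agent, assistance_agent). The reward-learning agent is forced to resolve the reward before acting, while the assistive agent treats the query as just another action and weighs its value of information against its cost.

```python
# A minimal toy sketch (not the paper's implementation) of the structural
# difference between the two paradigms. Assumed setup: a latent preference
# theta picks which of two items the human wants; the agent can pay
# QUERY_COST to ask the human before acting, or act immediately.

QUERY_COST = 0.1        # assumed cost of querying the human
REWARD_RIGHT = 1.0      # reward for handing over the preferred item
REWARD_WRONG = 0.0      # reward for the other item


def reward_learning_agent(belief):
    """Two-phase paradigm: first learn the reward (query), then act
    optimally under the learned reward. The query happens regardless of
    how confident the prior belief already is."""
    value = -QUERY_COST + REWARD_RIGHT   # query resolves theta, then act
    return "query-then-act", value


def assistance_agent(belief):
    """Single-policy paradigm: plan in the belief MDP over the latent
    reward parameter, so a query is just another action whose value of
    information is weighed against its cost."""
    p = max(belief, 1.0 - belief)        # probability of the likelier theta
    act_now = p * REWARD_RIGHT + (1.0 - p) * REWARD_WRONG
    query_first = -QUERY_COST + REWARD_RIGHT
    if query_first > act_now:
        return "query-then-act", query_first
    return "act-immediately", act_now


if __name__ == "__main__":
    for belief in (0.5, 0.7, 0.95):
        print(f"P(theta=0)={belief:.2f}  "
              f"reward learning: {reward_learning_agent(belief)}  "
              f"assistance: {assistance_agent(belief)}")
```

Running the sketch, the two agents agree when the belief is uncertain (0.5), but at a confident belief (0.95) only the assistive agent forgoes the query, since acting immediately is worth more than paying for information it already effectively has.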
