Commitment Semantics for Sequential Decision Making under Reward Uncertainty

Cooperating agents can make commitments to help each other, but when actions have stochastic outcomes, those commitments may have to be probabilistic. We consider a further complication: an agent might prefer to change its policy as it learns more about its reward function from experience. How should such an agent be allowed to change its policy while still faithfully pursuing its commitment in a principled, decision-theoretic manner? We address this question by defining a class of Dec-POMDPs with Bayesian reward uncertainty and by developing a novel Commitment Constrained Iterative Mean Reward algorithm, which implements the semantics of faithful commitment pursuit while still allowing the agent to respond to its evolving understanding of its rewards. We bound the performance of our algorithm theoretically and evaluate empirically how effectively it balances solution quality against computation cost.
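
To make the idea of a probabilistic commitment under reward uncertainty concrete, the following is a minimal sketch and not the paper's Commitment Constrained Iterative Mean Reward algorithm. It assumes a hypothetical single-agent, three-state MDP (rather than a Dec-POMDP), a belief over two candidate reward tables, and a commitment of the form "reach state 2 within the horizon with probability at least rho". All names, numbers, and the brute-force search over stationary policies are illustrative assumptions.

```python
from itertools import product

# Hypothetical 3-state chain; state 2 is the (absorbing) commitment state.
# P[a][s] lists next-state probabilities for taking action a in state s.
P = {
    "work": [[0.2, 0.8, 0.0], [0.0, 0.2, 0.8], [0.0, 0.0, 1.0]],
    "rest": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
}
ACTIONS = list(P)
N_STATES = 3


def mean_reward(candidates, weights):
    """Posterior-mean reward for each (state, action) pair."""
    return {
        (s, a): sum(w * r[(s, a)] for r, w in zip(candidates, weights))
        for s in range(N_STATES)
        for a in ACTIONS
    }


def evaluate(policy, reward, start, target, horizon):
    """Return (expected total reward, probability of being in target at the horizon).

    Because target is absorbing here, the second value equals the
    probability of having reached target by the horizon.
    """
    dist = [0.0] * N_STATES
    dist[start] = 1.0
    total = 0.0
    for _ in range(horizon):
        nxt = [0.0] * N_STATES
        for s, p in enumerate(dist):
            a = policy[s]
            total += p * reward[(s, a)]
            for s2, q in enumerate(P[a][s]):
                nxt[s2] += p * q
        dist = nxt
    return total, dist[target]


def best_committed_policy(reward, start, target, horizon, rho):
    """Brute-force search: best stationary policy that keeps the commitment."""
    best = None
    for policy in product(ACTIONS, repeat=N_STATES):
        value, prob = evaluate(policy, reward, start, target, horizon)
        if prob >= rho and (best is None or value > best[0]):
            best = (value, prob, policy)
    return best


# Two hypothetical reward hypotheses that disagree on the cost of working.
cheap = {(s, a): (-1.0 if a == "work" else 0.0) for s in range(N_STATES) for a in ACTIONS}
costly = {(s, a): (-3.0 if a == "work" else 0.0) for s in range(N_STATES) for a in ACTIONS}

reward = mean_reward([cheap, costly], [0.5, 0.5])
result = best_committed_policy(reward, start=0, target=2, horizon=3, rho=0.85)
print(result)
```

In this toy setting the agent would rather rest everywhere, but the commitment constraint rules that out, so the search returns the cheapest policy that still meets the probability threshold. As the belief weights change with experience, calling `mean_reward` and `best_committed_policy` again re-plans against the updated mean reward while leaving the commitment constraint untouched.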
