Paper information - Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning


This paper examines the notion of symmetry in Markov decision processes (MDPs). We define symmetry for an MDP and show how it can be exploited for more effective learning in single-agent, multiagent, and multirobot systems. We prove that if an MDP possesses a symmetry, then the optimal value function and Q function are similarly symmetric and there exists a symmetric optimal policy. If an MDP is known to possess a symmetry, this knowledge can be applied to decrease the number of training examples needed for algorithms like Q-learning and value iteration. It can also be used to directly restrict the hypothesis space.
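As an illustration of how a known symmetry might reduce the experience Q-learning needs, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical line world with states -3..3, a reward for reaching state 0, and a reflection symmetry (s, a) -> (-s, -a). The names `step`, `mirror`, `q_update`, and `train` are invented for this example. Each observed transition is also applied to its mirror image, so the learned Q function stays symmetric and states that were never visited still receive values.

```python
import random

STATES = range(-3, 4)   # hypothetical line world: positions -3..3
ACTIONS = (-1, 1)       # move left or move right

def step(s, a):
    """Deterministic move, clipped to [-3, 3]; reward 1 on reaching state 0."""
    s2 = max(-3, min(3, s + a))
    done = (s2 == 0)
    return s2, (1.0 if done else 0.0), done

def mirror(s, a):
    """Assumed symmetry of this toy MDP: reflect state and action about 0."""
    return -s, -a

def q_update(Q, s, a, r, s2, done, alpha=0.5, gamma=0.9):
    """One standard tabular Q-learning backup."""
    target = r if done else r + gamma * max(Q[(s2, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def train(episodes=200, use_symmetry=True, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice([1, 2, 3])        # experience comes only from s > 0
        for _ in range(20):              # cap episode length
            a = rng.choice(ACTIONS)      # purely exploratory behaviour
            s2, r, done = step(s, a)
            q_update(Q, s, a, r, s2, done)
            if use_symmetry:
                # Reuse the same sample for the mirrored transition.
                ms, ma = mirror(s, a)
                q_update(Q, ms, ma, r, -s2, done)
            if done:
                break
            s = s2
    return Q

if __name__ == "__main__":
    q_sym = train(use_symmetry=True)
    q_plain = train(use_symmetry=False)
    print("with symmetry,    Q(-1, +1) =", q_sym[(-1, 1)])
    print("without symmetry, Q(-1, +1) =", q_plain[(-1, 1)])
```

In this sketch the agent never visits a negative state, so plain Q-learning leaves those entries at their initial value, while the mirrored updates fill them in from the same samples. Restricting the table to one representative per symmetric pair would be another way to use the same knowledge.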

Tucker R. Balch | Martin A. Zinkevich
