Monte-Carlo Expectation Maximization for Decentralized POMDPs
暂无分享,去创建一个
[1] George A. Bekey,et al. On autonomous robots , 1998, The Knowledge Engineering Review.
[2] Ronald C. Neath,et al. On Convergence Properties of the Monte Carlo EM Algorithm , 2012, 1206.4768.
[3] Marc Toussaint,et al. Scalable Multiagent Planning Using Probabilistic Inference , 2011, IJCAI.
[4] Marc Toussaint,et al. Learning model-free robot control by a Monte Carlo EM algorithm , 2009, Auton. Robots.
[5] Xinhua Zhang,et al. Conditional random fields for multi-agent reinforcement learning , 2007, ICML '07.
[6] Sanjay Ghemawat,et al. MapReduce: Simplified Data Processing on Large Clusters , 2004, OSDI.
[7] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[8] Jaakko Peltonen,et al. Periodic Finite State Controllers for Efficient POMDP and DEC-POMDP Planning , 2011, NIPS.
[9] Bikramjit Banerjee,et al. Sample Bounded Distributed Reinforcement Learning for Decentralized POMDPs , 2012, AAAI.
[10] Robert M Thrall,et al. Mathematics of Operations Research. , 1978 .
[11] Marc Toussaint,et al. Hierarchical POMDP Controller Optimization by Likelihood Maximization , 2008, UAI.
[12] K. Schittkowski,et al. NONLINEAR PROGRAMMING , 2022 .
[13] Shlomo Zilberstein,et al. Memory-Bounded Dynamic Programming for DEC-POMDPs , 2007, IJCAI.
[14] Makoto Yokoo,et al. Taming Decentralized POMDPs: Towards Efficient Policy Computation for Multiagent Settings , 2003, IJCAI.
[15] Shlomo Zilberstein,et al. Policy Iteration for Decentralized Control of Markov Decision Processes , 2009, J. Artif. Intell. Res..
[16] Feng Wu,et al. Rollout Sampling Policy Iteration for Decentralized POMDPs , 2010, UAI.
[17] Shlomo Zilberstein,et al. Dynamic Programming for Partially Observable Stochastic Games , 2004, AAAI.
[18] François Charpillet,et al. Mixed Integer Linear Programming for Exact Finite-Horizon Planning in Decentralized Pomdps , 2007, ICAPS.
[19] François Charpillet,et al. MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs , 2005, UAI.
[20] Makoto Yokoo,et al. Networked Distributed POMDPs: A Synergy of Distributed Constraint Optimization and POMDPs , 2005, IJCAI.
[21] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[22] Marc Toussaint,et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes , 2006, ICML.
[23] G. C. Wei,et al. A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms , 1990 .
[24] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[25] Peter Kulchyski. and , 2015 .
[26] Marc Toussaint,et al. Model-free reinforcement learning as mixture learning , 2009, ICML '09.
[27] D. Rubin,et al. Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .
[28] Frans A. Oliehoek,et al. Scaling Up Optimal Heuristic Search in Dec-POMDPs via Incremental Expansion , 2011, IJCAI.
[29] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[30] Jaakko Peltonen,et al. Efficient Planning for Factored Infinite-Horizon DEC-POMDPs , 2011, IJCAI.
[31] Brahim Chaib-draa,et al. Toward error-bounded algorithms for infinite-horizon DEC-POMDPs , 2011, AAMAS.
[32] Peter Auer,et al. Finite-time Analysis of the Multiarmed Bandit Problem , 2002, Machine Learning.
[33] Shlomo Zilberstein,et al. Anytime Planning for Decentralized POMDPs using Expectation Maximization , 2010, UAI.
[34] Victor R. Lesser,et al. Coordinated Multi-Agent Reinforcement Learning in Networked Distributed POMDPs , 2011, AAAI.