Scaling Expectation-Maximization for Inverse Reinforcement Learning to Multiple Robots under Occlusion