Learning model-free robot control by a Monte Carlo EM algorithm
Marc Toussaint | Nikos Vlassis | Georgios Kontes | Savas Piperidis
[1] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[2] G. C. Wei, et al. A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms, 1990.
[3] Radford M. Neal. Monte Carlo Implementation, 1996.
[4] Andrew P. Sage, et al. Uncertainty in Artificial Intelligence, 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[5] M. Goodman. Learning to Walk: The Origins of the UK's Joint Intelligence Committee, 2008.
[6] Michael I. Jordan. Learning in Graphical Models, 1999, NATO ASI Series.
[7] H. Sebastian Seung, et al. Learning to Walk in 20 Minutes, 2005.
[8] Yoon Keun Kwak, et al. Dynamic Analysis of a Nonholonomic Two-Wheeled Inverted Pendulum Robot, 2005, J. Intell. Robotic Syst.
[9] Nando de Freitas, et al. A Bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot, 2009, Auton. Robots.
[10] Marc Toussaint, et al. Model-free reinforcement learning as mixture learning, 2009, ICML '09.
[11] Geoffrey E. Hinton, et al. Using Expectation-Maximization for Reinforcement Learning, 1997, Neural Computation.
[12] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion), 1977.
[13] Nando de Freitas, et al. Bayesian Policy Learning with Trans-Dimensional MCMC, 2007, NIPS.
[14] Stefan Schaal, et al. Natural Actor-Critic, 2003, Neurocomputing.
[15] Pat Langley, et al. Editorial: On Machine Learning, 1986, Machine Learning.
[16] Michael I. Jordan, et al. PEGASUS: A policy search method for large MDPs and POMDPs, 2000, UAI.
[17] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[18] Gregory F. Cooper, et al. A Method for Using Belief Networks as Influence Diagrams, 1988, UAI.
[19] Geoffrey E. Hinton, et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants, 1998, Learning in Graphical Models.
[20] Jan Peters, et al. Policy Search for Motor Primitives, 2009, Künstliche Intell.
[21] Martin A. Riedmiller, et al. Reinforcement learning for robot soccer, 2009, Auton. Robots.
[22] Jan Peters, et al. Policy Search for Motor Primitives in Robotics.
[23] Jürgen Schmidhuber, et al. State-Dependent Exploration for Policy Gradient Methods, 2008, ECML/PKDD.
[24] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996.
[25] Stefan Schaal, et al. Reinforcement learning of motor skills with policy gradients, 2008.
[26] Marc Toussaint, et al. Probabilistic inference for solving (PO)MDPs, 2006.
[27] Pieter Abbeel, et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight, 2006, NIPS.
[28] Jan Peters, et al. Using reward-weighted imitation for robot Reinforcement Learning, 2009, IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning.