Yunbo Wang | Fei-Fei Li | Yuke Zhu | Jiajun Wu | Joshua B. Tenenbaum | Bo Liu | Simon S. Du
[1] Surya P. N. Singh, et al. An online and approximate solver for POMDPs with continuous action space, 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[2] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.
[3] Shie Mannor, et al. A Tutorial on the Cross-Entropy Method, 2005, Ann. Oper. Res.
[4] Marc Toussaint, et al. Probabilistic inference for solving discrete and continuous state Markov Decision Processes, 2006, ICML.
[5] Sergey Levine, et al. Variational Policy Search via Trajectory Optimization, 2013, NIPS.
[6] Sergey Levine, et al. Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review, 2018, ArXiv.
[7] Jonathan P. How, et al. Graph-based Cross Entropy method for solving multi-robot decentralized POMDPs, 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[8] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[9] Arnaud Doucet, et al. A survey of convergence results on particle filtering methods for practitioners, 2002, IEEE Trans. Signal Process.
[10] John N. Tsitsiklis, et al. The Complexity of Markov Decision Processes, 1987, Math. Oper. Res.
[11] Emanuel Todorov, et al. General duality between optimal control and estimation, 2008, 2008 47th IEEE Conference on Decision and Control.
[12] Marc Toussaint, et al. Robot trajectory optimization using approximate inference, 2009, ICML '09.
[13] Leslie Pack Kaelbling, et al. Belief space planning assuming maximum likelihood observations, 2010, Robotics: Science and Systems.
[14] Santosha K. Dwivedy, et al. Reinforcement Learning via Recurrent Convolutional Neural Networks, 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).
[15] Léon Bottou, et al. Wasserstein Generative Adversarial Networks, 2017, ICML.
[16] Aaron C. Courville, et al. Improved Training of Wasserstein GANs, 2017, NIPS.
[17] Joel Veness, et al. Monte-Carlo Planning in Large POMDPs, 2010, NIPS.
[18] David Hsu, et al. Particle Filter Networks with Application to Visual Localization, 2018, CoRL.
[19] Rémi Munos, et al. Particle Filter-based Policy Gradient in POMDPs, 2008, NIPS.
[20] Oliver Brock, et al. Differentiable Particle Filters: End-to-End Learning with Algorithmic Priors, 2018, Robotics: Science and Systems.
[21] Pieter Abbeel, et al. Value Iteration Networks, 2016, NIPS.
[22] J. Andrew Bagnell, et al. Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy, 2010.
[23] Yoshua Bengio, et al. Probabilistic Planning with Sequential Monte Carlo methods, 2018, ICLR.
[24] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[25] Yuichi Yoshida, et al. Spectral Normalization for Generative Adversarial Networks, 2018, ICLR.
[26] Pascal Poupart, et al. On Improving Deep Reinforcement Learning for POMDPs, 2017, ArXiv.
[27] Marc Toussaint, et al. An Approximate Inference Approach to Temporal Optimization in Optimal Control, 2010, NIPS.
[28] Yuval Tassa, et al. MuJoCo: A physics engine for model-based control, 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[29] Vicenç Gómez, et al. Optimal control as a graphical model inference problem, 2009, Machine Learning.
[30] David Hsu, et al. QMDP-Net: Deep Learning for Planning under Partial Observability, 2017, NIPS.
[31] Nikos A. Vlassis, et al. The Cross-Entropy Method for Policy Search in Decentralized POMDPs, 2008, Informatica.
[32] Shimon Whiteson, et al. Deep Variational Reinforcement Learning for POMDPs, 2018, ICML.
[33] Peter Stone, et al. Deep Recurrent Q-Learning for Partially Observable MDPs, 2015, AAAI Fall Symposia.
[34] David Hsu, et al. Particle Filter Networks: End-to-End Probabilistic Localization From Visual Observations, 2018, ArXiv.
[35] David Hsu, et al. Integrating Algorithmic Planning and Deep Learning for Partially Observable Navigation, 2018, ArXiv.
[36] Leslie Pack Kaelbling, et al. Learning Policies for Partially Observable Environments: Scaling Up, 1997, ICML.
[37] Mykel J. Kochenderfer, et al. Online Algorithms for POMDPs with Continuous State, Action, and Observation Spaces, 2017, ICAPS.