Philip S. Thomas | Georgios Theocharous | James E. Kostas | Yash Chandak | Scott M. Jordan
[1] Vivek S. Borkar, et al. The actor-critic algorithm as multi-time-scale stochastic approximation, 1997.
[2] John N. Tsitsiklis, et al. Actor-Critic Algorithms, 1999, NIPS.
[3] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[4] E. Bizzi, et al. Motor learning through the combination of primitives, 2000, Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences.
[5] Zoubin Ghahramani, et al. An Introduction to Hidden Markov Models and Bayesian Networks, 2001, Int. J. Pattern Recognit. Artif. Intell.
[6] Jun Nakanishi, et al. Learning Attractor Landscapes for Learning Motor Primitives, 2002, NIPS.
[7] Guy Shani, et al. An MDP-Based Recommender System, 2002, J. Mach. Learn. Res.
[8] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[9] Scotty D. Craig, et al. Integrating Affect Sensors in an Intelligent Tutoring System, 2004.
[10] Geoffrey E. Hinton, et al. Reinforcement Learning with Factored States and Actions, 2004, J. Mach. Learn. Res.
[11] M. Lemay, et al. Modularity of motor output evoked by intraspinal microstimulation in cats, 2004, Journal of Neurophysiology.
[12] J. Jing, et al. The Construction of Movement with Behavior-Specific and Behavior-Independent Modules, 2004, The Journal of Neuroscience.
[13] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction, 1998, MIT Press.
[14] S. Schaal. Dynamic Movement Primitives - A Framework for Motor Control in Humans and Humanoid Robotics, 2006.
[15] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint, 2008.
[16] Jan Peters, et al. Learning motor primitives for robotics, 2009, IEEE International Conference on Robotics and Automation.
[17] Shalabh Bhatnagar, et al. Natural actor-critic algorithms, 2009, Autom.
[18] Marco Wiering, et al. Using continuous action spaces to solve discrete problems, 2009, International Joint Conference on Neural Networks.
[19] Jan Peters, et al. Policy Search for Motor Primitives in Robotics, 2008, NIPS.
[20] Philip S. Thomas, et al. Policy Gradient Coagent Networks, 2011, NIPS.
[21] Jason Pazis, et al. Generalized Value Functions for Large Action Sets, 2011, ICML.
[22] Andrew G. Barto, et al. Conjugate Markov Decision Processes, 2011, ICML.
[23] George Konidaris, et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis, 2011, AAAI.
[24] Martha White, et al. Linear Off-Policy Actor-Critic, 2012, ICML.
[25] Andrew G. Barto, et al. Motor primitive discovery, 2012, IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[26] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[27] Philip Thomas. Bias in Natural Actor-Critic Algorithms, 2014, ICML.
[28] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[29] Richard Evans, et al. Deep Reinforcement Learning in Large Discrete Action Spaces, 2015, ArXiv abs/1512.07679.
[30] Philip S. Thomas, et al. Ad Recommendation Systems for Life-Time Value Optimization, 2015, WWW.
[31] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[32] Kenji Kawaguchi, et al. Deep Learning without Poor Local Minima, 2016, NIPS.
[33] Cordelia Schmid, et al. Label-Embedding for Image Classification, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[34] Roni Khardon, et al. Online Symbolic Gradient-Based Optimization for Factored Action MDPs, 2016, IJCAI.
[35] René Vidal, et al. Global Optimality in Neural Network Training, 2017, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Alexei A. Efros, et al. Curiosity-Driven Exploration by Self-Supervised Prediction, 2017, IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[37] Balaraman Ravindran, et al. Learning to Factor Policies and Action-Value Functions: Factored Action Space Representations for Deep Reinforcement Learning, 2017, ArXiv.
[38] Damien Ernst, et al. Reinforcement Learning for Electric Power System Decision and Control: Past Considerations and Perspectives, 2017.
[39] Zhengyao Jiang, et al. A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem, 2017, ArXiv.
[40] Trevor Darrell, et al. Loss is its own Reward: Self-Supervision for Reinforcement Learning, 2016, ICLR.
[41] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[42] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[43] Roni Khardon, et al. Lifted Stochastic Planning, Belief Propagation and Marginal MAP, 2018, AAAI Workshops.
[44] Yonina C. Eldar, et al. The Global Optimization Geometry of Shallow Linear Neural Networks, 2018, Journal of Mathematical Imaging and Vision.
[45] Shie Mannor, et al. The Natural Language of Actions, 2019, ICML.
[46] Joelle Pineau, et al. Combined Reinforcement Learning via Abstract Representations, 2018, AAAI.