[1] Stephen J. Wright, et al. Numerical Optimization, 2018, Fundamental Statistical Inference.
[2] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[3] Michail G. Lagoudakis, et al. Binary action search for learning continuous-action control policies, 2009, ICML.
[4] Pieter Abbeel, et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[5] Wei Chu, et al. Support Vector Ordinal Regression, 2007, Neural Computation.
[6] Jan Koutník, et al. Reinforcement Learning to Run… Fast, 2018.
[7] Emanuel Todorov, et al. General duality between optimal control and estimation, 2008, 47th IEEE Conference on Decision and Control.
[8] Pascal Vincent, et al. Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis, 2018, NeurIPS.
[9] Roger B. Grosse, et al. Optimizing Neural Networks with Kronecker-factored Approximate Curvature, 2015, ICML.
[10] Wojciech Zaremba, et al. OpenAI Gym, 2016, ArXiv.
[11] Yunhao Tang, et al. Implicit Policy for Reinforcement Learning, 2018, ArXiv.
[12] Sergey Levine, et al. Reinforcement Learning with Deep Energy-Based Policies, 2017, ICML.
[13] Richard Evans, et al. Deep Reinforcement Learning in Large Discrete Action Spaces, 2015, ArXiv:1512.07679.
[14] Sergey Levine, et al. End-to-End Training of Deep Visuomotor Policies, 2015, J. Mach. Learn. Res.
[15] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[16] Navdeep Jaitly, et al. Discrete Sequential Prediction of Continuous Actions for Deep RL, 2017, ArXiv.
[17] Sergey Levine, et al. Latent Space Policies for Hierarchical Reinforcement Learning, 2018, ICML.
[18] D. K. Smith, et al. Numerical Optimization, 2001, J. Oper. Res. Soc.
[19] Mohammad Emtiyaz Khan, et al. A Stick-Breaking Likelihood for Categorical Data Analysis with Latent Gaussian Models, 2012, AISTATS.
[20] Sebastian Scherer, et al. Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution, 2017, ICML.
[21] Benjamin Recht, et al. Simple random search provides a competitive approach to reinforcement learning, 2018, ArXiv.
[22] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[23] Wei Chu, et al. Gaussian Processes for Ordinal Regression, 2005, J. Mach. Learn. Res.
[24] Shakir Mohamed, et al. Variational Inference with Normalizing Flows, 2015, ICML.
[25] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[26] Jakub W. Pachocki, et al. Learning dexterous in-hand manipulation, 2018, Int. J. Robotics Res.
[27] Elman Mansimov, et al. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation, 2017, NIPS.
[28] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[29] John Langford, et al. Approximately Optimal Approximate Reinforcement Learning, 2002, ICML.
[30] Herke van Hoof, et al. Addressing Function Approximation Error in Actor-Critic Methods, 2018, ICML.
[31] Arash Tavakoli, et al. Action Branching Architectures for Deep Reinforcement Learning, 2017, AAAI.
[32] Gianluca Pollastri, et al. A neural network approach to ordinal regression, 2008, IEEE International Joint Conference on Neural Networks.
[33] Christopher Winship, et al. Regression Models with Ordinal Variables, 1984.
[34] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[35] Marco Wiering, et al. Using continuous action spaces to solve discrete problems, 2009, International Joint Conference on Neural Networks.
[36] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.