Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning
[1] Dimitri P. Bertsekas, et al. Stochastic optimal control: the discrete time case, 2007.
[2] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[3] Alborz Geramifard, et al. RLPy: a value-function-based reinforcement learning framework for education and research, 2015, J. Mach. Learn. Res.
[4] Adam Krzyzak, et al. A Distribution-Free Theory of Nonparametric Regression, 2002, Springer series in statistics.
[5] Csaba Szepesvári, et al. Regularization in reinforcement learning, 2011.
[6] J. Peterson. On-Line Estimation of the Optimal Value Function: HJB-Estimators, 1992, NIPS 1992.
[7] A. Müller. Integral Probability Metrics and Their Generating Classes of Functions, 1997, Advances in Applied Probability.
[8] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[9] Yann Ollivier, et al. Making Deep Q-learning methods robust to time discretization, 2019, ICML.
[10] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[11] Daniel Nikovski, et al. Truncated Approximate Dynamic Programming with Task-Dependent Terminal Value, 2016, AAAI.
[12] Jan Peters, et al. Learning Motor Skills - From Algorithms to Robot Experiments, 2013, Springer Tracts in Advanced Robotics.
[13] Dimitri P. Bertsekas, et al. Dynamic programming and optimal control, 3rd Edition, 2005.
[14] Csaba Szepesvári, et al. Model Selection in Reinforcement Learning, 2011, Machine Learning.
[15] Nan Jiang, et al. The Dependence of Effective Planning Horizon on Model Accuracy, 2015, AAMAS.
[16] C. Villani. Optimal Transport: Old and New, 2008.
[17] Stefan Schaal, et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients, 2008.
[18] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[19] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[20] Michael O. Duff, et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems, 1994, NIPS.
[21] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning, 1998, ICML.
[22] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[23] Marcello Restelli, et al. Configurable Markov Decision Processes, 2018, ICML.
[24] Marek Petrik, et al. Biasing Approximate Dynamic Programming with a Lower Discount Factor, 2008, NIPS.
[25] M. Baykal-Gursoy, et al. Semi-Markov Decision Processes, 2022.
[26] Doina Precup, et al. The Option-Critic Architecture, 2016, AAAI.
[27] L. Györfi, et al. A Distribution-Free Theory of Nonparametric Regression (Springer Series in Statistics), 2002.
[28] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[29] Balaraman Ravindran, et al. Dynamic Action Repetition for Deep Reinforcement Learning, 2017, AAAI.
[30] Rémi Coulom, et al. Reinforcement Learning Using Neural Networks, with Applications to Motor Control. (Apprentissage par renforcement utilisant des réseaux de neurones, avec des applications au contrôle moteur), 2002.
[31] Yang Hu, et al. Reinforcement learning for robotic manipulation using simulated locomotion demonstrations, 2019, Machine Learning.
[32] Marcello Restelli, et al. Reinforcement Learning in Configurable Continuous Environments, 2019, ICML.
[33] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[34] Satinder P. Singh, et al. Reinforcement Learning with a Hierarchy of Abstract Models, 1992, AAAI.
[35] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[36] Michail G. Lagoudakis, et al. On the locality of action domination in sequential decision making, 2010, ISAIM.
[37] C. Karen Liu, et al. Assistive Gym: A Physics Simulation Framework for Assistive Robotics, 2019, 2020 IEEE International Conference on Robotics and Automation (ICRA).
[38] Doina Precup, et al. Temporal abstraction in reinforcement learning, 2000, ICML 2000.
[39] W. Fleming, et al. Controlled Markov processes and viscosity solutions, 1992.
[40] Satinder P. Singh, et al. Scaling Reinforcement Learning Algorithms by Learning Variable Temporal Resolution Models, 1992, ML.
[41] Andrew W. Moore, et al. Efficient memory-based learning for robot control, 1990.
[42] Csaba Szepesvári, et al. Finite-Time Bounds for Fitted Value Iteration, 2008, J. Mach. Learn. Res.
[43] Luca Bascetta, et al. Policy gradient in Lipschitz Markov Decision Processes, 2015, Machine Learning.
[44] L. C. Baird, et al. Reinforcement learning in continuous time: advantage updating, 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).
[45] K. Gürsoy, et al. Semi-Markov Decision Processes, 2007, Probability in the Engineering and Informational Sciences.
[46] Rémi Munos, et al. Performance Bounds in Lp-norm for Approximate Value Iteration, 2007, SIAM J. Control. Optim.
[47] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[48] Rémi Munos, et al. A Convergent Reinforcement Learning Algorithm in the Continuous Case Based on a Finite Difference Method, 1997, IJCAI.
[49] Peter Dayan, et al. Improving Policies without Measuring Merits, 1995, NIPS.
[50] Martin A. Riedmiller, et al. Batch Reinforcement Learning, 2012, Reinforcement Learning.
[51] Csaba Szepesvári, et al. Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, 2006, Machine Learning.
[52] F. Fairman. Introduction to dynamic systems: Theory, models and applications, 1979, Proceedings of the IEEE.
[53] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[54] Rémi Munos, et al. Reinforcement Learning for Continuous Stochastic Control Problems, 1997, NIPS.
[55] Gaël Varoquaux, et al. Scikit-learn: Machine Learning in Python, 2011, J. Mach. Learn. Res.
[56] Shie Mannor, et al. Approximate Value Iteration with Temporally Extended Actions, 2015, J. Artif. Intell. Res.
[57] Sergey Levine, et al. Learning to Walk via Deep Reinforcement Learning, 2018, Robotics: Science and Systems.
[58] Kenji Doya, et al. Reinforcement Learning in Continuous Time and Space, 2000, Neural Computation.
[59] Pierre Geurts, et al. Extremely randomized trees, 2006, Machine Learning.