Reinforcement Learning in Continuous Time and Space
[1] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[2] J. J. Hopfield,et al. Neurons with graded response have collective computational properties like those of two-state neurons , 1984, Proceedings of the National Academy of Sciences of the United States of America.
[3] C. Watkins. Learning from delayed rewards , 1989 .
[4] Vijaykumar Gullapalli,et al. A stochastic reinforcement learning algorithm for learning real-valued functions , 1990, Neural Networks.
[5] Steven J. Bradtke,et al. Reinforcement Learning Applied to Linear Quadratic Regulation , 1992, NIPS.
[6] J. Peterson. On-Line Estimation of the Optimal Value Function: HJB-Estimators , 1992, NIPS.
[7] W. Fleming,et al. Controlled Markov processes and viscosity solutions , 1992 .
[8] James K. Peterson,et al. On Line Estimation of Optimal Control Sequences: HJB Estimators , 1992, NIPS.
[9] Christopher G. Atkeson,et al. Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming , 1993, NIPS.
[10] Andrew W. Moore,et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces , 1995, Machine Learning.
[11] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[12] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[13] Michael O. Duff,et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems , 1994, NIPS.
[14] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.
[15] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[16] Kenji Doya,et al. Temporal Difference Learning in Continuous Time and Space , 1995, NIPS.
[17] Richard S. Sutton,et al. A Menu of Designs for Reinforcement Learning Over Time , 1995 .
[18] Richard S. Sutton,et al. TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.
[19] Peter Dayan,et al. Improving Policies without Measuring Merits , 1995, NIPS.
[20] Thomas G. Dietterich,et al. High-Performance Job-Shop Scheduling With A Time-Delay TD(λ) Network , 1995, NIPS.
[21] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[22] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[23] A. Harry Klopf,et al. Reinforcement Learning Applied to a Differential Game , 1995, Adapt. Behav..
[24] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1996 .
[25] Geoffrey J. Gordon. Stable Fitted Reinforcement Learning , 1995, NIPS.
[26] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[27] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[28] Kenji Doya,et al. Efficient Nonlinear Control with Actor-Tutor Architecture , 1996, NIPS.
[29] Minoru Asada,et al. Action-based sensor space categorization for robot learning , 1996, Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. IROS '96.
[30] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS.
[31] Dimitri P. Bertsekas,et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems , 1996, NIPS.
[32] Rémi Munos,et al. Reinforcement Learning for Continuous Stochastic Control Problems , 1997, NIPS.
[33] Stephan Pareigis,et al. Adaptive Choice of Grid and Time in Reinforcement Learning , 1997, NIPS.
[34] Rémi Munos,et al. A Convergent Reinforcement Learning Algorithm in the Continuous Case Based on a Finite Difference Method , 1997, IJCAI.
[35] Jun Morimoto,et al. Reinforcement Learning of Dynamic Motor Sequence: Learning to Stand Up , 1998, Conference on Intelligent Robots and Systems.
[36] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[37] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[38] M. K. Ali,et al. Self-Adapting Reactive Autonomous Agents , 2000 .
[39] Rajesh P. N. Rao,et al. Spike-Timing-Dependent Hebbian Plasticity as Temporal Difference Learning , 2001, Neural Computation.
[40] Jun Morimoto,et al. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning , 2000, Robotics Auton. Syst..
[41] Kenji Doya,et al. Neural mechanisms of learning and control , 2001 .
[42] Mitsuo Kawato,et al. Multiple Model-Based Reinforcement Learning , 2002, Neural Computation.