Tree-Based Batch Mode Reinforcement Learning
[1] R. Bellman,et al. Polynomial approximation—a new computational technique in dynamic programming: Allocation processes , 1962 .
[2] D. Luenberger. Optimization by Vector Space Methods , 1968 .
[3] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[4] C. Atkeson,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[5] Mark W. Spong,et al. Swing up control of the Acrobot , 1994, Proceedings of the 1994 IEEE International Conference on Robotics and Automation.
[6] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[7] John Rust. Using Randomization to Break the Curse of Dimensionality , 1997 .
[8] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.
[9] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[10] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst.
[11] Geoffrey J. Gordon. Online Fitted Reinforcement Learning , 1995 .
[12] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.
[13] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[14] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res.
[15] Andrew McCallum,et al. Reinforcement learning with selective perception and hidden state , 1996 .
[16] Manuela M. Veloso,et al. Tree Based Discretization for Continuous State Space Reinforcement Learning , 1998, AAAI/IAAI.
[17] Preben Alstrøm,et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping , 1998, ICML.
[18] O. Hernández-Lerma,et al. Discrete-time Markov control processes , 1999 .
[19] Geoffrey J. Gordon,et al. Approximate solutions to Markov decision processes , 1999 .
[20] Thomas G. Dietterich,et al. Efficient Value Function Approximation Using Regression Trees , 1999 .
[21] Junichiro Yoshimoto,et al. Application of reinforcement learning to balancing of Acrobot , 1999, IEEE SMC'99 Conference Proceedings. 1999 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.99CH37028).
[22] L. Breiman. Some Infinity Theory for Predictor Ensembles , 2000 .
[23] Peter W. Glynn,et al. Kernel-Based Reinforcement Learning in Average-Cost Problems: An Application to Optimal Portfolio Choice , 2000, NIPS.
[24] Leslie Pack Kaelbling,et al. Practical Reinforcement Learning in Continuous Spaces , 2000, ICML.
[25] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[26] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[27] Rosaleen J. Anderson. Near optimal closed-loop control Application to electric power systems , 2003 .
[28] J. Langford,et al. Reducing T-step reinforcement learning to classification , 2003 .
[29] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res.
[30] Pierre Geurts,et al. Iteratively Extending Time Horizon Reinforcement Learning , 2003, ECML.
[31] Jeff G. Schneider,et al. Policy Search by Dynamic Programming , 2003, NIPS.
[32] Michail G. Lagoudakis,et al. Reinforcement Learning as Classification: Leveraging Modern Classifiers , 2003, ICML.
[33] Leo Breiman,et al. Random Forests , 2001, Machine Learning.
[34] John N. Tsitsiklis,et al. Asynchronous Stochastic Approximation and Q-Learning , 1994, Machine Learning.
[35] Andrew W. Moore,et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces , 2004, Machine Learning.
[36] John N. Tsitsiklis,et al. Feature-based methods for large scale dynamic programming , 2004, Machine Learning.
[37] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[38] Justin A. Boyan,et al. Technical Update: Least-Squares Temporal Difference Learning , 2002, Machine Learning.
[39] Andrew W. Moore,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[40] D. Ernst,et al. Approximate Value Iteration in the Reinforcement Learning Context. Application to Electrical Power System Control. , 2005 .
[41] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[42] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[43] Pierre Geurts,et al. Extremely randomized trees , 2006, Machine Learning.
[44] Yi Lin,et al. Random Forests and Adaptive Nearest Neighbors , 2006 .
[45] Liming Xiang,et al. Kernel-Based Reinforcement Learning , 2006, ICIC.