TD algorithm for the variance of return and mean-variance reinforcement learning
[1] E. Elton. Modern portfolio theory and investment analysis , 1981 .
[2] D. White. Mean, variance, and probabilistic criteria in finite Markov decision processes: A review , 1988 .
[3] Anton Schwartz,et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards , 1993, ICML.
[4] Matthias Heger,et al. Consideration of risk in reinforcement learning , 1994, ICML.
[5] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[6] Matthias Heger,et al. Consideration of Risk in Reinforcement Learning , 1994, ICML.
[7] E. Fernández-Gaucherand,et al. Non-standard optimality criteria for stochastic control problems , 1995, Proceedings of 1995 34th IEEE Conference on Decision and Control.
[8] Thomas G. Dietterich,et al. High-Performance Job-Shop Scheduling With A Time-Delay TD(λ) Network , 1995, NIPS.
[9] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[10] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State , 1995, ICML.
[11] Dimitri P. Bertsekas,et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems , 1996, NIPS.
[12] Daniel Hernández-Hernández,et al. Risk Sensitive Markov Decision Processes , 1997 .
[13] Ralph Neuneier,et al. Enhancing Q-Learning for Optimal Asset Allocation , 1997, NIPS.
[14] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[15] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[16] Timothy X. Brown,et al. Optimizing Admission Control while Ensuring Quality of Service in Multimedia Networks via Reinforcement Learning , 1998, NIPS.
[17] Andrew W. Moore,et al. Variable Resolution Discretization for High-Accuracy Solutions of Optimal Control Problems , 1999, IJCAI.
[18] Justin A. Boyan,et al. Least-Squares Temporal Difference Learning , 1999, ICML.
[19] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[20] Sridhar Mahadevan,et al. Average Reward Reinforcement Learning: Foundations, Algorithms, and Empirical Results , 2005, Machine Learning.