Consideration of Risk in Reinforcement Learning
[1] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[2] Sven Koenig, et al. Utility-Based Planning, 1993.
[3] Sebastian Thrun, et al. Efficient Exploration In Reinforcement Learning, 1992.
[4] C. Watkins. Learning from delayed rewards, 1989.
[5] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[6] Richard S. Sutton, et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[7] D. Bertsekas, et al. On the minimax feedback control of uncertain dynamic systems, 1971, CDC 1971.
[8] H. S. Witsenhausen, et al. Minimax Controls of Uncertain Systems, 1966.
[9] J. Neumann, et al. Theory of Games and Economic Behavior, 1945.