Non-Markovian Policies in Sequential Decision Problems
[1] Csaba Szepesvári, et al. A Generalized Reinforcement-Learning Model: Convergence and Applications, 1996, ICML.
[2] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.
[3] R. Bellman, et al. Dynamic Programming and Markov Processes, 1960.
[4] S. Verdú, et al. Abstract dynamic programming models under commutativity conditions, 1987.
[5] Onésimo Hernández-Lerma, et al. Controlled Markov Processes, 1965.
[6] D. Bertsekas. Monotone Mappings with Application in Dynamic Programming, 1977.
[7] Csaba Szepesvári, et al. Module Based Reinforcement Learning for a Real Robot, 1997.
[8] Csaba Szepesvári, et al. Learning and Exploitation Do Not Conflict Under Minimax Optimality, 1997, ECML.