Using EM for Reinforcement Learning
[1] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[2] F. Downton. Stochastic Approximation, 1969, Nature.
[3] L. Baum, et al. A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains, 1970.
[4] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.
[5] Richard S. Sutton, et al. Temporal credit assignment in reinforcement learning, 1984.
[6] P. Anandan, et al. Pattern-recognizing stochastic learning automata, 1985, IEEE Transactions on Systems, Man, and Cybernetics.
[7] Geoffrey J. McLachlan, et al. Mixture models: inference and applications to clustering, 1989.
[8] Kumpati S. Narendra, et al. Learning automata - an introduction, 1989.
[9] Geoffrey E. Hinton. Connectionist Learning Procedures, 1989, Artif. Intell.
[10] 永福 智志. The Organization of Learning, 2005, Journal of Cognitive Neuroscience.
[11] Michael I. Jordan, et al. Reinforcement Learning by Probability Matching, 1995, NIPS 1995.