Profit Sharing Using a Dynamic Reinforcement Function Considering Expectation Value of Reinforcement
Takashi Okamoto | Hironori Hirata | Seiichi Koakutsu | Daisuke Tamashima
[1] Hidehiro Nakano, et al. A reinforcement learning method using a dynamic reinforcement function based on action selection probability, 2007, Systems and Computers in Japan.
[2] Peter Dayan, et al. Q-learning, 1992, Machine Learning.
[3] Richard S. Sutton, et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming, 1990, ML.
[4] Shoji Tatsumi, et al. A Profit Sharing Method for Forgetting Past Experiences Effectively, 2006.
[5] Shoji Tatsumi, et al. About the Reinforcement Function for Profit Sharing, 2004.