Leader-Follower semi-Markov Decision Problems: Theoretical Framework and Approximate Solution
Kurian K. Tharakunnel | Siddhartha Bhattacharyya
[1] J. Filar, et al. Competitive Markov Decision Processes, 1996.
[2] Jose B. Cruz, et al. Optimal and Near-Optimal Incentive Strategies in the Hierarchical Control of Markov Chains, 1983.
[3] Michael P. Wellman, et al. Nash Q-Learning for General-Sum Stochastic Games, 2003, J. Mach. Learn. Res.
[4] D. Fudenberg, et al. The Theory of Learning in Games, 1998.
[5] Fernando Bernstein, et al. Coordinating Supply Chains with Simple Pricing Schemes: The Role of Vendor-Managed Inventories, 2006, Manag. Sci.
[6] David S. Leslie, et al. Individual Q-Learning in Normal Form Games, 2005, SIAM J. Control. Optim.
[7] Harri Ehtamo, et al. Recent Studies on Incentive Design Problems in Game Theory and Management Science, 2002.
[8] E. J. Collins, et al. Convergent multiple-timescales reinforcement learning algorithms in normal form games, 2003.
[9] S. Marcus, et al. Multi-time Scale Markov Decision Processes, 2005.
[10] Y. Narahari, et al. Design of Incentive Compatible Mechanisms for Stackelberg Problems, 2005, WINE.
[11] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[12] T. Başar, et al. Dynamic Noncooperative Game Theory, 1982.
[13] S. Mahadevan, et al. Solving Semi-Markov Decision Problems Using Average Reward Reinforcement Learning, 1999.
[14] A. Keyhani. Leader-Follower Framework for Control of Energy Services, 2002, IEEE Power Engineering Review.
[15] R. Radner. Repeated Principal-Agent Games with Discounting, 1985.
[16] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[17] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[18] Manuela M. Veloso, et al. Multiagent learning using a variable learning rate, 2002, Artif. Intell.
[19] Erica L. Plambeck, et al. Performance-Based Incentives in a Dynamic Principal-Agent Model, 2000, Manuf. Serv. Oper. Manag.
[20] V. Borkar. Stochastic approximation with two time scales, 1997.
[21] Mark A. Shayman, et al. Multitime scale Markov decision processes, 2003, IEEE Trans. Autom. Control.
[22] Abhijit Gosavi, et al. Reinforcement learning for long-run average cost, 2004, Eur. J. Oper. Res.
[23] Jose B. Cruz, et al. An incentive model of duopoly with government coordination, 1981, Autom.
[24] T. Başar, et al. Incentive-Based Pricing for Network Games with Complete and Incomplete Information, 2007.