Decentralized Learning in Finite Markov Chains: Revisited
[1] Apostolos Burnetas, et al. Computing Optimal Policies for Markovian Decision Processes Using Simulation, 1995.
[2] S. Marcus, et al. Approximate receding horizon approach for Markov decision processes: average reward case, 2003.
[3] Alfredo García, et al. A Decentralized Approach to Discrete Optimization via Simulation: Application to Network Flow, 2007, Oper. Res.
[4] Hyeong Soo Chang. Finite-Step Approximation Error Bounds for Solving Average-Reward-Controlled Markov Set-Chains, 2008, IEEE Transactions on Automatic Control.
[5] James E. Smith, et al. Structural Properties of Stochastic Dynamic Programs, 2002, Oper. Res.
[6] O. Hernández-Lerma, et al. A forecast horizon and a stopping rule for general Markov decision processes, 1988.
[7] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[8] K. Narendra, et al. Decentralized learning in finite Markov chains, 1985, 1985 24th IEEE Conference on Decision and Control.
[9] Raúl Montes-de-Oca, et al. Conditions for the uniqueness of optimal policies of discounted Markov decision processes, 2004, Math. Methods Oper. Res.
[10] Donald M. Topkis, et al. Minimizing a Submodular Function on a Lattice, 1978, Oper. Res.
[11] Apostolos Burnetas, et al. On confidence intervals from simulation of finite Markov chains, 1997, Math. Methods Oper. Res.
[12] Robert L. Smith, et al. A Fictitious Play Approach to Large-Scale Optimization, 2005, Oper. Res.
[13] O. Hernández-Lerma. Adaptive Markov Control Processes, 1989.
[14] R. Weber, et al. Optimal control of service rates in networks of queues, 1987, Advances in Applied Probability.
[15] L. Shapley, et al. Fictitious Play Property for Games with Identical Interests, 1996.
[16] David D. Yao, et al. Monotone Optimal Control of Permutable GSMPs, 1994, Math. Oper. Res.
[17] Alfredo García, et al. A Game-Theoretic Approach to Efficient Power Management in Sensor Networks, 2008, Oper. Res.
[18] R. Amir. Supermodularity and Complementarity in Economics: An Elementary Survey, 2003.
[19] William L. Cooper, et al. Convergence of Simulation-Based Policy Iteration, 2003, Probability in the Engineering and Informational Sciences.
[20] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[21] Xi-Ren Cao, et al. A unified approach to Markov decision problems and performance sensitivity analysis, 2000, at - Automatisierungstechnik.
[22] D. M. Topkis. Supermodularity and Complementarity, 1998.
[23] R. Amir, et al. A Lattice-Theoretic Approach to a Class of Dynamic Games, 1989.
[24] J. Robinson. An Iterative Method of Solving a Game, 1951, Classics in Game Theory.
[25] Richard L. Tweedie, et al. Markov Chains and Stochastic Stability, 1993, Communications and Control Engineering Series.