Q-learning for Markov decision processes with a satisfiability criterion
[1] D. Leslie, et al. Asynchronous stochastic approximation with differential inclusions, 2011, arXiv:1112.2288.
[2] Abhijeet Bhorkar, et al. An on-line learning algorithm for energy efficient delay constrained scheduling over a fading channel, 2008, IEEE Journal on Selected Areas in Communications.
[3] J. Aubin, et al. Differential inclusions: set-valued maps and viability theory, 1984.
[4] Daniel H. Wagner. Survey of Measurable Selection Theorems, 1977.
[5] Shalabh Bhatnagar, et al. The Borkar-Meyn theorem for asynchronous stochastic approximations, 2011, Syst. Control. Lett.
[6] Vivek S. Borkar, et al. Structural Properties of Optimal Transmission Policies Over a Randomly Varying Channel, 2008, IEEE Transactions on Automatic Control.
[7] D. Blackwell. An analog of the minimax theorem for vector payoffs, 1956.
[8] Sean P. Meyn, et al. The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning, 2000, SIAM J. Control. Optim.
[9] Emanuel Milman. Approachable sets of vector payoffs in stochastic games, 2006, Games Econ. Behav.
[10] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint, 2008.
[11] Sanjeev Arora, et al. The Multiplicative Weights Update Method: a Meta-Algorithm and Applications, 2012, Theory Comput.
[12] Manfred Schäl. Estimation and control in discounted stochastic dynamic programming, 1987.
[13] Vivek S. Borkar, et al. Approachability in Stackelberg Stochastic Games with Vector Costs, 2017, Dyn. Games Appl.
[14] Josef Hofbauer, et al. Stochastic Approximations and Differential Inclusions, 2005, SIAM J. Control. Optim.
[15] A. Shwartz, et al. Guaranteed performance regions in Markovian systems with competing decision makers, 1993, IEEE Trans. Autom. Control.
[16] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[17] Vivek S. Borkar, et al. Learning Algorithms for Markov Decision Processes with Average Cost, 2001, SIAM J. Control. Optim.
[18] William H. Sandholm, et al. The projection dynamic and the geometry of population games, 2008, Games Econ. Behav.
[19] V. Borkar. Stochastic approximation with two time scales, 1997.
[20] Josef Hofbauer, et al. Evolutionary Games and Population Dynamics, 1998.
[21] Anna Nagurney, et al. Dynamical systems and variational inequalities, 1993, Ann. Oper. Res.
[22] William H. Sandholm, et al. Population Games And Evolutionary Dynamics, 2010, Economic learning and social evolution.
[23] V. Borkar. Asynchronous Stochastic Approximations, 1998.
[24] Josef Hofbauer, et al. Stochastic Approximations and Differential Inclusions, Part II: Applications, 2006, Math. Oper. Res.