Distributed asynchronous policy iteration in dynamic programming
[1] Dimitri P. Bertsekas, et al. Stochastic optimal control: the discrete time case, 2007.
[2] D. Bertsekas. The auction algorithm: A distributed relaxation method for the assignment problem, 1988.
[3] F. Robert. Contraction en norme vectorielle: Convergence d'itérations chaotiques pour des équations non linéaires de point fixe à plusieurs variables, 1976.
[4] M. Tarazi. Some convergence results for asynchronous algorithms, 1982.
[5] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[6] Andrew G. Barto, et al. Reinforcement learning, 1998.
[7] D. Bertsekas, et al. Partially asynchronous, parallel algorithms for network flow and other problems, 1990.
[8] Dimitri P. Bertsekas, et al. Dual coordinate step methods for linear network flow problems, 1988, Math. Program.
[9] John N. Tsitsiklis, et al. Parallel and distributed computation, 1989.
[10] Didier El Baz, et al. Asynchronous Iterative Algorithms with Flexible Communication for Nonlinear Network Flow Problems, 1996, J. Parallel Distributed Comput.
[11] V. Borkar. Asynchronous Stochastic Approximations, 1998.
[12] Dimitri P. Bertsekas, et al. Q-learning and enhanced policy iteration in discounted dynamic programming, 2010, CDC.
[13] Ronald J. Williams, et al. Analysis of Some Incremental Variants of Policy Iteration: First Steps Toward Understanding Actor-Cr, 1993.
[14] John N. Tsitsiklis, et al. On the stability of asynchronous iterative processes, 1986, 1986 25th IEEE Conference on Decision and Control.
[15] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[16] Dimitri P. Bertsekas, et al. Q-learning and enhanced policy iteration in discounted dynamic programming, 2010, 49th IEEE Conference on Decision and Control (CDC).
[17] Dimitri Bertsekas, et al. Distributed dynamic programming, 1981, 1981 20th IEEE Conference on Decision and Control including the Symposium on Adaptive Processes.
[18] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[19] Dimitri P. Bertsekas, et al. Distributed asynchronous computation of fixed points, 1983, Math. Program.
[20] Richard S. Sutton, et al. Reinforcement Learning, 1992, Handbook of Machine Learning.
[21] D. Bertsekas, et al. Distributed asynchronous relaxation methods for convex network flow problems, 1987.
[22] J. C. Miellou, et al. Algorithmes de relaxation chaotique à retards, 1975.
[23] John N. Tsitsiklis, et al. Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives, 1999, IEEE Trans. Autom. Control.
[24] D. Bertsekas. Monotone Mappings with Application in Dynamic Programming, 1977.
[25] J. Walrand, et al. Distributed Dynamic Programming, 2022.
[26] Gérard M. Baudet, et al. Asynchronous Iterative Methods for Multiprocessors, 1978, JACM.
[27] Patrizia Beraldi, et al. A Parallel Asynchronous Implementation of the e-Relaxation Method for the Linear Minimum Cost Flow Problem, 1997, Parallel Comput.
[28] John N. Tsitsiklis, et al. Asynchronous Stochastic Approximation and Q-Learning, 1994, Machine Learning.
[29] Dimitri P. Bertsekas, et al. Parallel synchronous and asynchronous implementations of the auction algorithm, 1991, Parallel Comput.