PID Accelerated Value Iteration Algorithm
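For orientation, here is a minimal sketch of the idea the title refers to: ordinary value iteration V_{k+1} = T V_k is viewed as a feedback loop whose error signal is the Bellman residual T V_k − V_k, and proportional, integral, and derivative terms are added to that loop. The function name, gains (kp, ki, kd), and step parameters (alpha, beta) below are illustrative assumptions for a tabular policy-evaluation setting, not the paper's tuned settings.

```python
import numpy as np

def pid_value_iteration(P, r, gamma, kp=1.0, ki=0.0, kd=0.0,
                        alpha=0.05, beta=0.95, iters=500):
    """Sketch of PID-accelerated value iteration for a tabular MRP.

    P: (n, n) transition matrix; r: (n,) reward vector; gamma: discount.
    Gains kp/ki/kd and the integrator parameters alpha/beta are
    illustrative placeholders, not values from the paper.
    """
    n = len(r)
    V, V_prev = np.zeros(n), np.zeros(n)
    z = np.zeros(n)                                # integral (accumulated error) term
    for _ in range(iters):
        e = r + gamma * (P @ V) - V                # error signal: Bellman residual T V - V
        z = beta * z + alpha * e                   # discounted accumulation of the error
        V_new = (V
                 + kp * e                          # proportional term
                 + ki * z                          # integral term
                 + kd * (V - V_prev))              # derivative term
        V_prev, V = V, V_new
    return V
```

As a sanity check, setting kp = 1 and ki = kd = 0 reduces the update to V_{k+1} = T V_k, i.e., standard value iteration; nonzero ki and kd introduce the extra feedback terms that the acceleration relies on.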
[1] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[2] Thibault Langlois, et al. Parameter adaptation in stochastic optimization, 1999.
[3] Harold J. Kushner, et al. Accelerated procedures for the solution of discrete Markov control problems, 1971.
[4] O. Nelles, et al. An Introduction to Optimization, 1996, IEEE Antennas and Propagation Magazine.
[5] Ronald J. Williams, et al. Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions, 1993.
[6] Ian Postlethwaite, et al. Multivariable Feedback Control: Analysis and Design, 1996.
[7] Hilbert J. Kappen, et al. Speedy Q-Learning, 2011, NIPS.
[8] E. Yaz, et al. Linear optimal control, H2 and H∞ methods, by Jeffrey B. Burl, Addison Wesley Longman, Inc., Menlo Park, CA, 1999, 2000.
[9] Jing Peng, et al. Efficient Learning and Planning Within the Dyna Framework, 1993, Adapt. Behav.
[10] Marcello Restelli, et al. Boosted Fitted Q-Iteration, 2017, ICML.
[11] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[12] Nicol N. Schraudolph, et al. Local Gain Adaptation in Stochastic Gradient Descent, 1999.
[13] J. Doyle, et al. Essentials of Robust Control, 1997.
[14] Andrew W. Moore, et al. Prioritized sweeping: Reinforcement learning with less data and less time, 2004, Machine Learning.
[15] V. Berinde. Iterative Approximation of Fixed Points, 2007.
[16] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[17] Richard S. Sutton, et al. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta, 1992, AAAI.
[18] Matthieu Geist, et al. Anderson Acceleration for Reinforcement Learning, 2018, EWRL.
[19] Mark W. Schmidt, et al. Online Learning Rate Adaptation with Hypergradient Descent, 2017, ICLR.
[20] Kevin D. Seppi, et al. Prioritization Methods for Accelerating MDP Solvers, 2005, J. Mach. Learn. Res.
[21] Shalabh Bhatnagar, et al. Natural actor-critic algorithms, 2009, Autom.
[22] Shie Mannor, et al. Regularized Fitted Q-Iteration for planning in continuous-space Markovian decision problems, 2009, American Control Conference.
[23] Nan Jiang, et al. Information-Theoretic Considerations in Batch Reinforcement Learning, 2019, ICML.
[24] R. Jungers. The Joint Spectral Radius: Theory and Applications, 2009.
[25] Peter Kuster, et al. Nonlinear and Adaptive Control Design, 2016.
[26] Rémi Munos, et al. Performance Bounds in Lp-norm for Approximate Value Iteration, 2007, SIAM J. Control. Optim.
[27] Dimitri P. Bertsekas, et al. Stochastic optimal control: the discrete time case, 2007.
[28] B. Pasik-Duncan, et al. Adaptive Control, 1996, IEEE Control Systems.
[29] Bruno Scherrer, et al. Momentum in Reinforcement Learning, 2020, AISTATS.
[30] Donald G. M. Anderson. Iterative Procedures for Nonlinear Integral Equations, 1965, JACM.
[31] Michail G. Lagoudakis, et al. Least-Squares Policy Iteration, 2003, J. Mach. Learn. Res.
[32] Csaba Szepesvári, et al. Finite-Time Bounds for Fitted Value Iteration, 2008, J. Mach. Learn. Res.
[33] Sean P. Meyn, et al. Zap Q-Learning, 2017, NIPS.
[34] Naresh K. Sinha, et al. Modern Control Systems, 1981, IEEE Transactions on Systems, Man, and Cybernetics.
[35] P. N. Paraskevopoulos, et al. Modern Control Engineering, 2001.
[36] Gao Huang, et al. Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning, 2019, NeurIPS.
[37] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[38] Patrick M. Pilarski, et al. Tuning-free step-size adaptation, 2012, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[39] Geoffrey J. Gordon, et al. Fast Exact Planning in Markov Decision Processes, 2005, ICAPS.
[40] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[41] Vineet Goyal, et al. A First-Order Approach to Accelerated Value Iteration, 2019, Oper. Res.