Reduced variance deep reinforcement learning with temporal logic specifications
Michael M. Zavlanos | Yan Zhang | Davood Hajinezhad | Yiannis Kantaros | Qitong Gao
[1] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[2] Andrew G. Barto, et al. Improving Elevator Performance Using Reinforcement Learning, 1995, NIPS.
[3] Francis Bach, et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, 2014, NIPS.
[4] Jing Wang, et al. Temporal logic motion control using actor-critic methods, 2012, ICRA.
[5] J. Kiefer, et al. Stochastic Estimation of the Maximum of a Regression Function, 1952.
[6] B. V. Dean, et al. Studies in Linear and Non-Linear Programming, 1959.
[7] Gerald Tesauro, et al. Temporal Difference Learning and TD-Gammon, 1995, J. Int. Comput. Games Assoc.
[8] S. Shankar Sastry, et al. A learning based approach to control synthesis of Markov decision processes for linear temporal logic specifications, 2014, 53rd IEEE Conference on Decision and Control.
[9] Michael M. Zavlanos, et al. Probabilistic Motion Planning Under Temporal Tasks and Soft Constraints, 2017, IEEE Transactions on Automatic Control.
[10] Calin Belta, et al. Reinforcement learning with temporal logic rewards, 2016, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[11] Shalabh Bhatnagar, et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation, 2009, ICML '09.
[12] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[13] Calin Belta, et al. Incremental controller synthesis in probabilistic environments with temporal logic constraints, 2014, Int. J. Robotics Res.
[14] Alexander J. Smola, et al. Fast Incremental Method for Nonconvex Optimization, 2016, ArXiv.
[15] Shalabh Bhatnagar, et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation, 2009, NIPS.
[16] K. Schittkowski, et al. Nonlinear Programming, 2022.
[17] Calin Belta, et al. Optimal Control of Markov Decision Processes With Linear Temporal Logic Constraints, 2014, IEEE Transactions on Automatic Control.
[18] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[19] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[20] Calin Belta, et al. A Policy Search Method For Temporal Logic Specified Reinforcement Learning Tasks, 2017, 2018 Annual American Control Conference (ACC).
[21] Calin Belta, et al. Formal Methods for Discrete-Time Dynamical Systems, 2017.
[22] Yi Zhou, et al. An optimal randomized incremental gradient method, 2015, Mathematical Programming.
[23] Dimos V. Dimarogonas, et al. Multi-agent plan reconfiguration under local LTL specifications, 2015, Int. J. Robotics Res.
[24] Ufuk Topcu, et al. Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints, 2014, Robotics: Science and Systems.
[25] B. V. Dean, et al. Studies in Linear and Non-Linear Programming, 1959.
[26] Zeyuan Allen Zhu, et al. Variance Reduction for Faster Non-Convex Optimization, 2016, ICML.
[27] Emilio Frazzoli, et al. Sampling-based algorithms for optimal motion planning, 2011, Int. J. Robotics Res.
[28] Lihong Li, et al. Stochastic Variance Reduction Methods for Policy Evaluation, 2017, ICML.
[29] Michael M. Zavlanos, et al. Sampling-Based Optimal Control Synthesis for Multirobot Systems Under Global Temporal Tasks, 2017, IEEE Transactions on Automatic Control.
[30] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[31] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[32] Martin A. Riedmiller, et al. Batch Reinforcement Learning, 2012, Reinforcement Learning.
[33] Zhaoran Wang, et al. NESTT: A Nonconvex Primal-Dual Splitting Method for Distributed and Stochastic Optimization, 2016, NIPS.
[34] Boris Polyak. Some methods of speeding up the convergence of iteration methods, 1964.
[35] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[36] Sergey Levine, et al. Continuous Deep Q-Learning with Model-based Acceleration, 2016, ICML.
[37] Michael M. Zavlanos, et al. Distributed Optimal Control Synthesis for Multi-Robot Systems under Global Temporal Tasks, 2018, 2018 ACM/IEEE 9th International Conference on Cyber-Physical Systems (ICCPS).
[38] Mark W. Schmidt, et al. Minimizing finite sums with the stochastic average gradient, 2013, Mathematical Programming.
[39] Christel Baier, et al. Principles of model checking, 2008.
[40] Dimitri P. Bertsekas, et al. Incremental Gradient, Subgradient, and Proximal Methods for Convex Optimization: A Survey, 2015, ArXiv.