Transition Based Discount Factor for Model Free Algorithms in Reinforcement Learning
K. Lakshmanan | Ruchir Gupta | Abhinav Sharma | Atul Gupta
[1] Onder Tutsoy,et al. Chaotic dynamics and convergence analysis of temporal difference algorithms with bang‐bang control , 2016 .
[2] Tomás Prieto-Rumeau,et al. Discrete-time control with non-constant discount factor , 2020, Math. Methods Oper. Res..
[3] Martin Brown,et al. Reinforcement learning analysis for a minimum time balance problem , 2016 .
[4] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[5] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse , 1996 .
[6] Xianping Guo,et al. Markov decision processes with state-dependent discount factors and unbounded rewards/costs , 2011, Oper. Res. Lett..
[7] Ling Shi,et al. Deep Reinforcement Learning for Wireless Sensor Scheduling in Cyber-Physical Systems , 2018, Autom..
[8] Shimon Whiteson,et al. Learning Retrospective Knowledge with Reverse Reinforcement Learning , 2020, NeurIPS.
[9] Elif Surer,et al. Using Generative Adversarial Nets on Atari Games for Feature Extraction in Deep Reinforcement Learning , 2020, 2020 28th Signal Processing and Communications Applications Conference (SIU).
[10] Damien Ernst,et al. How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies , 2015, ArXiv.
[11] K. Doya,et al. The Role of Serotonin in the Regulation of Patience and Impulsivity , 2012, Molecular Neurobiology.
[12] John Stachurski,et al. Dynamic programming with state-dependent discounting , 2019, J. Econ. Theory.
[13] Shuang Li,et al. Reinforcement Learning Approach to Design Practical Adaptive Control for a Small-Scale Intelligent Vehicle , 2019, Symmetry.
[14] Min Oh,et al. Deep reinforcement learning optimization framework for a power generation plant considering performance and environmental issues , 2021 .
[15] Daniel Kroening,et al. Cautious Reinforcement Learning with Logical Constraints , 2020, AAMAS.
[16] Daniel Kroening,et al. Deep Reinforcement Learning with Temporal Logics , 2020, FORMATS.
[17] Silviu Pitis,et al. Rethinking the Discount Factor in Reinforcement Learning: A Decision Theoretic Approach , 2019, AAAI.
[18] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[19] Qi Sun,et al. Hierarchical Reinforcement Learning for Self-Driving Decision-Making without Reliance on Labeled Driving Data , 2020, IET Intelligent Transport Systems.
[20] Juan González-Hernández,et al. Markov control processes with randomized discounted cost , 2007, Math. Methods Oper. Res..
[21] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[22] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[23] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[24] Rajesh Elara Mohan,et al. Complete coverage path planning using reinforcement learning for Tetromino based cleaning and maintenance robot , 2020 .