Revisiting State Augmentation methods for Reinforcement Learning with Stochastic Delays
[1] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[2] Tsuneo Yoshikawa,et al. Ground-space bilateral teleoperation of ETS-VII robot arm by direct bilateral coupling under 7-s time delay condition , 2004, IEEE Transactions on Robotics and Automation.
[3] Mridul Agarwal,et al. Blind Decision Making: Reinforcement Learning with Delayed Observations , 2020, ICAPS.
[4] Hartmut Logemann,et al. Destabilizing effects of small time delays on feedback-controlled descriptor systems , 1998 .
[5] Thomas J. Walsh,et al. Planning and Learning in Environments with Delayed Feedback , 2007, ECML.
[6] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[7] Eitan Altman,et al. Congestion control as a stochastic control problem with action delays , 1999, Autom..
[8] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..
[9] Chelsea C. White III. Note on “A Partially Observable Markov Decision Process with Lagged Information” , 1988 .
[10] Elizabeth Gibney,et al. Google AI algorithm masters ancient game of Go , 2016, Nature.
[11] P J Beek,et al. Theoretical analysis of destabilization resonances in time-delayed stochastic second-order dynamical systems and some implications for human motor control. , 2006, Physical review. E, Statistical, nonlinear, and soft matter physics.
[12] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[13] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[14] Milind Tambe,et al. Test sensitivity is secondary to frequency and turnaround time for COVID-19 surveillance , 2020, medRxiv : the preprint server for health sciences.
[15] Robert Babuska,et al. Control delay in Reinforcement Learning for real-time dynamic systems: A memoryless approach , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[16] Maolin Jin,et al. Robust Compliant Motion Control of Robot With Nonlinear Friction Using Time-Delay Estimation , 2008, IEEE Transactions on Industrial Electronics.
[17] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[18] Chelsea C. White,et al. Markov decision processes with noise-corrupted and delayed state observations , 1999, J. Oper. Res. Soc..
[19] Shinhong Kim,et al. A Partially Observable Markov Decision Process with Lagged Information , 1987 .
[20] M. Battegay,et al. Reproductive number of the COVID-19 epidemic in Switzerland with a focus on the Cantons of Basel-Stadt and Basel-Landschaft. , 2020, Swiss medical weekly.
[21] M. Fan,et al. Effect of delay in diagnosis on transmission of COVID-19. , 2020, Mathematical biosciences and engineering : MBE.
[22] Jing Zhao,et al. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus–Infected Pneumonia , 2020, The New England journal of medicine.
[23] Cornelius T. Leondes,et al. Technical Note - Markov Decision Processes with State-Information Lag , 1972, Oper. Res..
[24] Thomas J. Walsh,et al. Learning and planning in environments with delayed feedback , 2009, Autonomous Agents and Multi-Agent Systems.
[25] Konstantinos V. Katsikopoulos,et al. Markov decision processes with delays and asynchronous cost collection , 2003, IEEE Trans. Autom. Control..
[26] Dennis Huisman,et al. Delay Propagation and Delay Management in Transportation Networks , 2018 .
[27] Jonathan Binas,et al. Reinforcement Learning with Random Delays , 2021, ICLR.
[28] Liang Li,et al. Delay-Aware Model-Based Reinforcement Learning for Continuous Control , 2020, Neurocomputing.
[29] S. Kim. State information lag markov decision process with control limit rule , 1985 .
[30] Karol Hausman,et al. Thinking While Moving: Deep Reinforcement Learning with Concurrent Control , 2020, ICLR.
[31] A. Markman,et al. The Curse of Planning: Dissecting Multiple Reinforcement-Learning Systems by Taxing the Central Executive , 2013 .
[32] Chris Pal,et al. Real-Time Reinforcement Learning , 2019, NeurIPS.
[33] Eitan Altman,et al. Closed-loop control with delayed information , 1992, SIGMETRICS '92/PERFORMANCE '92.
[34] Wotao Yin,et al. On Unbounded Delays in Asynchronous Parallel Fixed-Point Algorithms , 2016, J. Sci. Comput..
[35] Shie Mannor,et al. Acting in Delayed Environments with Non-Stationary Markov Policies , 2021, ICLR.