Measuring the Reliability of Reinforcement Learning Algorithms