Marc G. Bellemare | Aaron C. Courville | Rishabh Agarwal | Max Schwarzer | Pablo Samuel Castro
[1] Tal Arbel, et al. Accounting for Variance in Machine Learning Benchmarks, 2021, MLSys.
[2] Matteo Hessel, et al. When to use parametric models in reinforcement learning?, 2019, NeurIPS.
[3] Mario Lucic, et al. Are GANs Created Equal? A Large-Scale Study, 2017, NeurIPS.
[4] Pierre-Yves Oudeyer, et al. A Hitchhiker's Guide to Statistical Comparisons of Reinforcement Learning Algorithms, 2019, RML@ICLR.
[5] Gerd Gigerenzer, et al. Statistical Rituals: The Replication Delusion and How We Got There, 2018, Advances in Methods and Practices in Psychological Science.
[6] Marc G. Bellemare, et al. Distributional Reinforcement Learning with Quantile Regression, 2017, AAAI.
[7] H. B. Mann, et al. On a Test of Whether one of Two Random Variables is Stochastically Larger than the Other, 1947, The Annals of Mathematical Statistics.
[8] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[9] M. Baker. 1,500 scientists lift the lid on reproducibility, 2016, Nature.
[10] Rémi Munos, et al. Implicit Quantile Networks for Distributional Reinforcement Learning, 2018, ICML.
[11] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[12] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[13] Tom Schaul, et al. Return-based Scaling: Yet Another Normalisation Trick for Deep RL, 2021, ArXiv.
[14] Houqiang Li, et al. Masked Contrastive Representation Learning for Reinforcement Learning, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Jinwoo Shin, et al. State Entropy Maximization with Random Encoders for Efficient Exploration, 2021, ICML.
[16] Iryna Gurevych, et al. Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging, 2017, EMNLP.
[17] Wee Sun Lee, et al. Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning, 2021, ECML/PKDD.
[18] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[19] Alexander D'Amour, et al. The MultiBERTs: BERT Reproductions for Robustness Analysis, 2021, ArXiv.
[20] Animesh Garg, et al. D2RL: Deep Dense Architectures in Reinforcement Learning, 2020, ArXiv.
[21] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[22] B. L. Welch. The generalization of "Student's" problem when several different population variances are involved, 1947, Biometrika.
[23] Jimmy J. Lin, et al. Significant Improvements over the State of the Art? A Case Study of the MS MARCO Document Ranking Leaderboard, 2021, SIGIR.
[24] S. Goodman, et al. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations, 2016, European Journal of Epidemiology.
[25] Mauro Birattari, et al. How to assess and report the performance of a stochastic algorithm on a benchmark problem: mean or best result on a number of runs?, 2007, Optim. Lett.
[26] Wojciech M. Czarnecki, et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning, 2019, Nature.
[27] Benjamin Recht, et al. Simple random search provides a competitive approach to reinforcement learning, 2018, ArXiv.
[28] J. Tukey. A survey of sampling from contaminated distributions, 1960.
[29] Pieter Abbeel, et al. Behavior From the Void: Unsupervised Active Pre-Training, 2021, ArXiv.
[30] John Schulman, et al. Phasic Policy Gradient, 2020, ICML.
[31] Fabien Moutarde, et al. Is Deep Reinforcement Learning Really Superhuman on Atari? Leveling the playing field, 2019, ArXiv.
[32] Joelle Pineau, et al. Improving Sample Efficiency in Model-Free Reinforcement Learning from Images, 2019, AAAI.
[33] Greg Wayne, et al. Synthetic Returns for Long-Term Credit Assignment, 2021, ArXiv.
[34] B. Efron. Bootstrap Methods: Another Look at the Jackknife, 1979, The Annals of Statistics.
[35] Jiashi Feng, et al. Improving Generalization in Reinforcement Learning with Mixture Regularization, 2020, NeurIPS.
[36] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[37] Edward Grefenstette, et al. Prioritized Level Replay, 2020, ICML.
[38] Samuel Ritter, et al. Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study, 2017, ICML.
[39] Pieter Abbeel, et al. APS: Active Pretraining with Successor Features, 2021, ICML.
[40] Peter Henderson, et al. Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control, 2017, ArXiv.
[41] Sameera S. Ponda, et al. Autonomous navigation of stratospheric balloons using reinforcement learning, 2020, Nature.
[42] Xi Chen, et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning, 2017, ArXiv.
[43] Sergey Levine, et al. Model-Based Reinforcement Learning for Atari, 2019, ICLR.
[44] F. Götze, et al. Resampling Fewer than n Observations: Gains, Losses, and Remedies for Losses, 1997, Statistica Sinica.
[45] Joelle Pineau, et al. Improving Reproducibility in Machine Learning Research (A Report from the NeurIPS 2019 Reproducibility Program), 2020, J. Mach. Learn. Res.
[46] Pieter Abbeel, et al. Reinforcement Learning with Augmented Data, 2020, NeurIPS.
[47] András Lörincz, et al. Learning Tetris Using the Noisy Cross-Entropy Method, 2006, Neural Computation.
[48] Pascal Vincent, et al. Unreproducible Research is Reproducible, 2019, ICML.
[49] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[50] Ilya Kostrikov, et al. Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels, 2020, ArXiv.
[51] Scott M. Jordan, et al. Evaluating the Performance of Reinforcement Learning Algorithms, 2020, ICML.
[52] Chris Dyer, et al. On the State of the Art of Evaluation in Neural Language Models, 2017, ICLR.
[53] Honglak Lee, et al. Predictive Information Accelerates Learning in RL, 2020, NeurIPS.
[54] Rob Fergus, et al. Decoupling Value and Policy for Generalization in Reinforcement Learning, 2021, ICML.
[55] Pieter Abbeel, et al. CURL: Contrastive Unsupervised Representations for Reinforcement Learning, 2020, ICML.
[56] Balaraman Ravindran, et al. SEERL: Sample Efficient Ensemble Reinforcement Learning, 2021, AAMAS.
[57] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[58] Veronika Cheplygina, et al. How I failed machine learning in medical imaging - shortcomings and recommendations, 2021, ArXiv.
[59] B. Efron. Better Bootstrap Confidence Intervals, 1987, Journal of the American Statistical Association.
[60] Marc G. Bellemare, et al. Dopamine: A Research Framework for Deep Reinforcement Learning, 2018, ArXiv.
[61] J. Schulman, et al. Leveraging Procedural Generation to Benchmark Reinforcement Learning, 2019, ICML.
[62] Rotem Dror, et al. Deep Dominance - How to Properly Compare Deep Neural Models, 2019, ACL.
[63] John Hallam, et al. A Survey on Reproducibility by Evaluating Deep Reinforcement Learning Algorithms on Real-World Robots, 2019, CoRL.
[64] Pierre Geurts, et al. Tree-Based Batch Mode Reinforcement Learning, 2005, J. Mach. Learn. Res.
[65] David Silver, et al. Muesli: Combining Improvements in Policy Optimization, 2021, ICML.
[66] Lukasz Kaiser, et al. Q-Value Weighted Regression: Reinforcement Learning with Limited Data, 2022, IJCNN.
[67] Mohammad Norouzi, et al. Mastering Atari with Discrete World Models, 2020, ICLR.
[68] Ilya Kostrikov, et al. Automatic Data Augmentation for Generalization in Deep Reinforcement Learning, 2020, ArXiv.
[69] H. Levy. Stochastic dominance and expected utility: survey and analysis, 1992, Management Science.
[70] Sara Hooker, et al. Randomness In Neural Network Training: Characterizing The Impact of Tooling, 2021, MLSys.
[71] Daniel Guo, et al. Agent57: Outperforming the Atari Human Benchmark, 2020, ICML.
[72] Peter Stone, et al. Deterministic Implementations for Reproducibility in Deep Reinforcement Learning, 2018, ArXiv.
[73] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[74] Demis Hassabis, et al. Mastering Atari, Go, chess and shogi by planning with a learned model, 2019, Nature.
[75] David Gal, et al. Abandon Statistical Significance, 2017, The American Statistician.
[76] D. Romer, et al. In Praise of Confidence Intervals, 2020, AEA Papers and Proceedings.
[77] Sergey Levine, et al. Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model, 2019, NeurIPS.
[78] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.
[79] Kacper Kielak. Do recent advancements in model-based deep reinforcement learning really improve data efficiency?, 2019.
[80] Kenneth O. Stanley, et al. Go-Explore: a New Approach for Hard-Exploration Problems, 2019, ArXiv.
[81] Fernando Diaz, et al. The Benchmark Lottery, 2021, ArXiv.
[82] Mohammad Norouzi, et al. Dream to Control: Learning Behaviors by Latent Imagination, 2019, ICLR.
[83] Mohammad Norouzi, et al. An Optimistic Perspective on Offline Reinforcement Learning, 2020, ICML.
[84] Oleksii Hrinchuk, et al. Catalyst.RL: A Distributed Framework for Reproducible RL Research, 2019, ArXiv.
[85] O. Pietquin, et al. Munchausen Reinforcement Learning, 2020, NeurIPS.
[86] J. Ioannidis. Why Most Published Research Findings Are False, 2005, PLoS Medicine.
[87] Pablo Samuel Castro, et al. Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research, 2021, ICML.
[88] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[89] John P. A. Ioannidis, et al. What does research reproducibility mean?, 2016, Science Translational Medicine.
[90] Marlos C. Machado, et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents, 2017, J. Artif. Intell. Res.
[91] David Warde-Farley, et al. Fast Task Inference with Variational Intrinsic Successor Features, 2019, ICLR.
[92] Pieter Abbeel, et al. SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning, 2021, ICML.
[93] Ali Farhadi, et al. Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping, 2020, ArXiv.
[94] John Foley, et al. Let's Play Again: Variability of Deep Reinforcement Learning Agents in Atari Environments, 2019, ArXiv.
[95] Richard S. Sutton, et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[96] Pierre-Yves Oudeyer, et al. How Many Random Seeds? Statistical Power Analysis in Deep Reinforcement Learning Experiments, 2018, ArXiv.
[97] Nenghai Yu, et al. Return-Based Contrastive Representation Learning for Reinforcement Learning, 2021, ICLR.
[98] John F. Canny, et al. Measuring the Reliability of Reinforcement Learning Algorithms, 2019, ICLR.
[99] N. Lazar, et al. Moving to a World Beyond "p < 0.05", 2019, The American Statistician.
[100] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[101] R Devon Hjelm, et al. Data-Efficient Reinforcement Learning with Self-Predictive Representations, 2020.
[102] Sander Greenland, et al. Scientists rise up against statistical significance, 2019, Nature.
[103] Ankush Gupta, et al. Unsupervised Learning of Object Keypoints for Perception and Control, 2019, NeurIPS.
[104] Jorge J. Moré, et al. Benchmarking optimization software with performance profiles, 2002, Math. Program.
[105] Sergey Levine, et al. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor, 2018, ICML.
[106] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[107] Jakub W. Pachocki, et al. Dota 2 with Large Scale Deep Reinforcement Learning, 2019, ArXiv.
[108] Tom Schaul, et al. Rainbow: Combining Improvements in Deep Reinforcement Learning, 2017, AAAI.