Pierre-Yves Oudeyer | Olivier Sigaud | Cédric Colas
[1] B. L. Welch. The generalization of 'Student's' problem when several different population variances are involved, 1947, Biometrika.
[2] Marcin Andrychowicz, et al. Parameter Space Noise for Exploration, 2017, ICLR.
[3] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[4] Alec Radford, et al. Proximal Policy Optimization Algorithms, 2017, ArXiv.
[5] B. L. Welch. The generalization of 'Student's' problem when several different population variances are involved, 1947, Biometrika.
[6] W. Rice. Analyzing tables of statistical tests, 1989, Evolution.
[7] Elman Mansimov, et al. Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation, 2017, NIPS.
[8] Philip Bachman, et al. Deep Reinforcement Learning that Matters, 2017, AAAI.
[9] Peter Henderson, et al. Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control, 2017, ArXiv.
[10] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[11] Benjamin Recht, et al. Simple random search provides a competitive approach to reinforcement learning, 2018, ArXiv.