Improved Feature Learning: A Maximum-Average-Out Deep Neural Network for the Game Go
Xiali Li | Zheng Wang | Bo Liu | Licheng Wu | Zhengyu Lv