Marlos C. Machado | Marc G. Bellemare | Erik Talvitie | Joel Veness | Matthew J. Hausknecht | Michael H. Bowling
[1] R. Bellman. Dynamic Programming, 1957, Science.
[2] Stewart W. Wilson. Knowledge Growth in an Artificial Animal, 1985, ICGA.
[3] Long-Ji Lin, et al. Reinforcement learning for robots using neural networks, 1992.
[4] Mahesan Niranjan, et al. On-line Q-learning using connectionist systems, 1994.
[5] Sebastian Thrun, et al. Lifelong robot learning, 1993, Robotics Auton. Syst.
[6] Leslie Pack Kaelbling, et al. On reinforcement learning for robots, 1996, IROS.
[7] Yishay Mansour, et al. Approximate Planning in Large POMDPs via Reusable Trajectories, 1999, NIPS.
[8] Murray Campbell, et al. Deep Blue, 2002, Artif. Intell.
[9] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning, 2001, J. Mach. Learn. Res.
[10] Nuttapong Chentanez, et al. Intrinsically Motivated Reinforcement Learning, 2004, NIPS.
[11] Mark B. Ring. CHILD: A First Step Towards Continual Learning, 1997, Machine Learning.
[12] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 1998, Machine Learning.
[13] Marcus Hutter. Simulation Algorithms for Computational Systems Biology, 2017, Texts in Theoretical Computer Science. An EATCS Series.
[14] Yishay Mansour, et al. Reinforcement Learning in POMDPs Without Resets, 2005, IJCAI.
[15] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[16] Andrew Zisserman, et al. Advances in Neural Information Processing Systems (NIPS), 2007.
[17] Pierre-Yves Oudeyer, et al. Intrinsic Motivation Systems for Autonomous Mental Development, 2007, IEEE Transactions on Evolutionary Computation.
[18] Jonathan Schaeffer, et al. Checkers Is Solved, 2007, Science.
[19] Michael L. Littman, et al. An analysis of model-based Interval Estimation for Markov Decision Processes, 2008, J. Comput. Syst. Sci.
[20] Richard S. Sutton, et al. A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation, 2008, NIPS.
[21] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey, 2009, J. Mach. Learn. Res.
[22] Nick Montfort, et al. Racing the Beam: The Atari Video Computer System, 2009.
[23] Yavar Naddaf, et al. Game-independent AI agents for playing Atari 2600 console games, 2010.
[24] R. Sutton, et al. GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces, 2010.
[25] Patrick M. Pilarski, et al. Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, 2011, AAMAS.
[26] Marc G. Bellemare, et al. Investigating Contingency Awareness Using Atari 2600 Games, 2012, AAAI.
[27] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[28] Marc G. Bellemare, et al. Sketch-Based Linear Value Function Approximation, 2012, NIPS.
[29] Santiago Ontañón, et al. A Survey of Real-Time Strategy Game AI Research and Competition in StarCraft, 2013, IEEE Transactions on Computational Intelligence and AI in Games.
[30] Marc G. Bellemare, et al. Bayesian Learning of Recursively Factored Environments, 2013, ICML.
[31] Risto Miikkulainen, et al. General Video Game Playing, 2013, Artificial and Computational Intelligence in Games.
[32] Andrew G. Barto, et al. Intrinsic Motivation and Reinforcement Learning, 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.
[33] Thore Graepel, et al. A Comparison of learning algorithms on the Arcade Learning Environment, 2014, ArXiv.
[34] Erik Talvitie, et al. Model Regularization for Stable Sample Rollouts, 2014, UAI.
[35] Honglak Lee, et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, 2014, NIPS.
[36] Risto Miikkulainen, et al. A Neuroevolution Approach to General Atari Game Playing, 2014, IEEE Transactions on Computational Intelligence and AI in Games.
[37] Marc G. Bellemare, et al. Skip Context Tree Switching, 2014, ICML.
[38] Bruno Bouchard, et al. Reports from the 2015 AAAI Workshop Program, 2015, AI Mag.
[39] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[40] Marc G. Bellemare, et al. Compress and Control, 2015, AAAI.
[41] Peter Stone, et al. The Impact of Determinism on Learning Atari 2600 Games, 2015, AAAI Workshop: Learning for General Competency in Video Games.
[42] Marlos C. Machado, et al. Domain-Independent Optimistic Initialization for Reinforcement Learning, 2014, AAAI Workshop: Learning for General Competency in Video Games.
[43] Hector Geffner, et al. Classical Planning with Simulators: Results on the Atari Video Games, 2015, IJCAI.
[44] Martial Hebert, et al. Improving Multi-Step Prediction of Learned Time Series Models, 2015, AAAI.
[45] Shane Legg, et al. Massively Parallel Methods for Deep Reinforcement Learning, 2015, ArXiv.
[46] Elliot Meyerson, et al. Frame Skip Is a Powerful Parameter for Learning to Play Atari, 2015, AAAI Workshop: Learning for General Competency in Video Games.
[47] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[48] Honglak Lee, et al. Action-Conditional Video Prediction using Deep Networks in Atari Games, 2015, NIPS.
[49] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2012, IJCAI.
[50] Razvan Pascanu, et al. Policy Distillation, 2015, ICLR.
[51] Ruslan Salakhutdinov, et al. Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, 2015, ICLR.
[52] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[53] Marlos C. Machado, et al. State of the Art Control of Atari Games Using Shallow Reinforcement Learning, 2015, AAMAS.
[54] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[55] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[56] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[57] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[58] Shie Mannor, et al. Graying the black box: Understanding DQNs, 2016, ICML.
[59] Demis Hassabis, et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[60] Katja Hofmann, et al. The Malmo Platform for Artificial Intelligence Experimentation, 2016, IJCAI.
[61] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[62] Marc G. Bellemare, et al. Safe and Efficient Off-Policy Reinforcement Learning, 2016, NIPS.
[63] Marc G. Bellemare, et al. Increasing the Action Gap: New Operators for Reinforcement Learning, 2015, AAAI.
[64] Carmel Domshlak, et al. Blind Search for Atari-Like Online Planning Revisited, 2016, IJCAI.
[65] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[66] Marcus Hutter, et al. Count-Based Exploration in Feature Space for Reinforcement Learning, 2017, IJCAI.
[67] Marc G. Bellemare, et al. Count-Based Exploration with Neural Density Models, 2017, ICML.
[68] Daniel Nikovski, et al. Value-Aware Loss Function for Model-based Reinforcement Learning, 2017, AISTATS.
[69] Marlos C. Machado, et al. A Laplacian Framework for Option Discovery in Reinforcement Learning, 2017, ICML.
[70] Malcolm I. Heywood, et al. Emergent Tangled Graph Representations for Atari Game Playing Agents, 2017, EuroGP.
[71] Tom Schaul, et al. Learning from Demonstrations for Real World Reinforcement Learning, 2017, ArXiv.
[72] Daan Wierstra, et al. Recurrent Environment Simulators, 2017, ICLR.
[73] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[74] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[75] Peter Henderson, et al. Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control, 2017, ArXiv.
[76] Alex S. Fukunaga, et al. Learning to Prune Dominated Action Sequences in Online Black-Box Planning, 2017, AAAI.
[77] Erik Talvitie, et al. Self-Correcting Models for Model-Based Reinforcement Learning, 2016, AAAI.
[78] Marlos C. Machado, et al. Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents (Extended Abstract), 2018, IJCAI.
[79] John Schulman, et al. Gotta Learn Fast: A New Benchmark for Generalization in RL, 2018, ArXiv.
[80] IEEE Transactions on Computational Intelligence and AI in Games, 2018, IEEE Transactions on Games.
[81] Tom Schaul, et al. Deep Q-learning From Demonstrations, 2017, AAAI.