The value of abstraction
Mark K. Ho | David Abel | Thomas L. Griffiths | Michael L. Littman