An algebraic approach to abstraction in reinforcement learning
[1] J. Piaget. The construction of reality in the child, 1954.
[2] Ronald A. Howard, et al. Dynamic Programming and Markov Processes, 1960.
[3] John G. Kemeny, et al. Finite Markov Chains, 1960.
[4] J. Hartmanis, et al. Algebraic Structure Theory of Sequential Machines, 1966.
[5] E. Denardo. Contraction Mappings in the Theory Underlying Dynamic Programming, 1967.
[6] Selby H. Evans, et al. A brief statement of schema theory, 1967.
[7] Saul Amarel, et al. On representations of problems of reasoning about actions, 1968.
[8] J. Robert Jump, et al. A Note on the Iterative Decomposition of Finite Automata, 1969, Inf. Control.
[9] J. Piaget, et al. The Origins of Intelligence in Children, 1971.
[10] Azaria Paz, et al. Introduction to Probabilistic Automata, 1971.
[11] J. K. Satia, et al. Markovian Decision Processes with Uncertain Transition Probabilities, 1973, Oper. Res.
[12] R. Schmidt. A schema theory of discrete motor skill learning, 1975.
[13] Ward Whitt, et al. Approximations of Dynamic Programs, I, 1978, Math. Oper. Res.
[14] W. Klein, et al. Speech, place, and action: studies in deixis and related topics, 1982.
[15] S. Ullman. Visual routines, 1984, Cognition.
[16] Robin Milner, et al. Algebraic laws for nondeterminism and concurrency, 1985, JACM.
[17] Michael A. Arbib, et al. Schema theory, 1998.
[18] Chelsea C. White, et al. Parameter Imprecision in Finite State, Finite Action Dynamic Programs, 1986, Oper. Res.
[19] Dimitri P. Bertsekas, et al. Dynamic Programming: Deterministic and Stochastic Models, 1987.
[20] David Chapman, et al. Pengi: An Implementation of a Theory of Activity, 1987, AAAI.
[21] Philip E. Agre, et al. The dynamic structure of everyday life, 1988.
[22] Keiji Kanazawa, et al. A model for reasoning about persistence and causation, 1989.
[23] Michael A. Arbib, et al. A formal model of computation for sensory-based robotics, 1989, IEEE Trans. Robotics Autom.
[24] Craig A. Knoblock. Learning Abstraction Hierarchies for Problem Solving, 1990, AAAI.
[25] David Chapman, et al. Vision, instruction, and action, 1990.
[26] J. Glover. Symmetry Groups and Translation Invariant Representations of Markov Processes, 1991.
[27] Kim G. Larsen, et al. Bisimulation through Probabilistic Testing, 1991, Inf. Comput.
[28] Geoffrey E. Hinton, et al. Feudal Reinforcement Learning, 1992, NIPS.
[29] Hilary Buxton, et al. Selective Attention in Dynamic Vision, 1993, IJCAI.
[30] David L. Dill, et al. Better verification through symmetry, 1996, Formal Methods Syst. Des.
[31] Craig Boutilier, et al. Using Abstractions for Decision-Theoretic Planning with Time Constraints, 1994, AAAI.
[32] Michael I. Jordan, et al. Massachusetts Institute of Technology Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences, 1996.
[33] Chelsea C. White, et al. Markov Decision Processes with Imprecise Transition Probabilities, 1994, Oper. Res.
[34] Martin L. Puterman, et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[35] Michael O. Duff, et al. Reinforcement Learning Methods for Continuous-Time Markov Decision Problems, 1994, NIPS.
[36] Thomas Dean, et al. Decomposition Techniques for Planning in Stochastic Domains, 1995, IJCAI.
[37] Ben J. A. Kröse, et al. Learning from delayed rewards, 1995, Robotics Auton. Syst.
[38] Long Ji Lin, et al. Reinforcement Learning of Non-Markov Decision Processes, 1995, Artif. Intell.
[39] Craig Boutilier, et al. Exploiting Structure in Policy Construction, 1995, IJCAI.
[40] Pattie Maes, et al. Emergent Hierarchical Control Structures: Learning Reactive/Hierarchical Relationships in Reinforcement Environments, 1996.
[41] Sérgio Vale Aguiar Campos, et al. Symbolic Model Checking, 1993, CAV.
[42] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[43] A. Prasad Sistla, et al. Symmetry and model checking, 1996, Formal Methods Syst. Des.
[44] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.
[45] Robert Givan, et al. Model Minimization, Regression, and Propositional STRIPS Planning, 1997, IJCAI.
[46] Stuart J. Russell, et al. Reinforcement Learning with Hierarchies of Machines, 1997, NIPS.
[47] Michael E. Cleary, et al. Systematic use of deictic commands for mobile robot navigation, 1997.
[48] A. Prasad Sistla, et al. Utilizing symmetry when model-checking under fairness assumptions: an automata-theoretic approach, 1997, TOPL.
[49] Robert Givan, et al. Model Minimization in Markov Decision Processes, 1997, AAAI/IAAI.
[50] J. A. Coelho, et al. A Control Basis for Learning Multifingered Grasps, 1997.
[51] Rajesh P. N. Rao, et al. Embodiment is the foundation, not a level, 1996, Behavioral and Brain Sciences.
[52] Robert Givan, et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes, 1997, UAI.
[53] Somesh Jha, et al. Combining Partial Order and Symmetry Reductions, 1997, TACAS.
[54] Andrew G. Barto, et al. Reinforcement learning, 1998.
[55] Leslie Pack Kaelbling, et al. Planning and Acting in Partially Observable Stochastic Domains, 1998, Artif. Intell.
[56] Ronald E. Parr, et al. Hierarchical control and learning for Markov decision processes, 1998.
[57] Bruce L. Digney, et al. Learning hierarchical control structures for multiple tasks and changing environments, 1998.
[58] Chris Drummond, et al. Composing Functions to Speed up Reinforcement Learning in a Changing World, 1998, ECML.
[59] Avi Pfeffer, et al. Probabilistic Frame-Based Systems, 1998, AAAI/IAAI.
[60] E. Allen Emerson, et al. Model Checking Real-Time Properties of Symmetric Systems, 1998, MFCS.
[61] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[62] Thomas G. Dietterich. State Abstraction in MAXQ Hierarchical Reinforcement Learning, 1999, NIPS.
[63] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.
[64] Lise Getoor, et al. Learning Probabilistic Relational Models, 1999, IJCAI.
[65] Roderic A. Grupen, et al. A Hybrid Architecture for Learning Robot Control Tasks, 1999.
[66] E. Allen Emerson, et al. From Asymmetry to Full Symmetry: New Techniques for Symmetry Reduction in Model Checking, 1999, CHARME.
[67] Thomas G. Dietterich. An Overview of MAXQ Hierarchical Reinforcement Learning, 2000, SARA.
[68] Daphne Koller, et al. Policy Iteration for Factored MDPs, 2000, UAI.
[69] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[70] Robert Givan, et al. Bounded-parameter Markov decision processes, 2000, Artif. Intell.
[71] Roderic A. Grupen, et al. Symmetries in World Geometry and Adaptive System Behaviour, 2000, AFPAC.
[72] Sridhar Mahadevan, et al. Hierarchical Memory-Based Reinforcement Learning, 2000, NIPS.
[73] Doina Precup, et al. Temporal abstraction in reinforcement learning, 2000, ICML.
[74] Andrew G. Barto, et al. Automated State Abstraction for Options using the U-Tree Algorithm, 2000, NIPS.
[75] David Andre, et al. Programmable Reinforcement Learning Agents, 2000, NIPS.
[76] Roderic A. Grupen, et al. A hybrid architecture for adaptive robot control, 2000.
[77] Balaraman Ravindran, et al. Symmetries and Model Minimization in Markov Decision Processes, 2001.
[78] Andrew G. Barto, et al. Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density, 2001, ICML.
[79] Tucker R. Balch, et al. Symmetry in Markov Decision Processes and its Implications for Single Agent and Multiagent Learning, 2001, ICML.
[80] Sridhar Mahadevan, et al. A reinforcement learning model of selective visual attention, 2001, AGENTS '01.
[81] Kee-Eung Kim, et al. Solving Factored MDPs via Non-Homogeneous Partitioning, 2001, IJCAI.
[82] Craig Boutilier, et al. Symbolic Dynamic Programming for First-Order MDPs, 2001, IJCAI.
[83] David Andre, et al. State abstraction for programmable reinforcement learning agents, 2002, AAAI/IAAI.
[84] Bernhard Hengst, et al. Discovering Hierarchy in Reinforcement Learning with HEXQ, 2002, ICML.
[85] Tim Oates, et al. The Thing That We Tried Didn't Work Very Well: Deictic Representation in Reinforcement Learning, 2002, UAI.
[86] Andrew G. Barto, et al. PolicyBlocks: An Algorithm for Creating Useful Macro-Actions in Reinforcement Learning, 2002, ICML.
[87] Balaraman Ravindran, et al. Model Minimization in Hierarchical Reinforcement Learning, 2002, SARA.
[88] Balaraman Ravindran, et al. SMDP Homomorphisms: An Algebraic Approach to Abstraction in Semi-Markov Decision Processes, 2003, IJCAI.
[89] Robert Givan, et al. Equivalence notions and model minimization in Markov decision processes, 2003, Artif. Intell.
[90] Robert Platt, et al. Extending fingertip grasping to whole body grasping, 2003, IEEE International Conference on Robotics and Automation (ICRA).
[91] Shlomo Zilberstein, et al. Symbolic Generalization for On-line Planning, 2002, UAI.
[92] Robert Platt, et al. Manipulation gaits: sequences of grasp control tasks, 2004, IEEE International Conference on Robotics and Automation (ICRA).
[93] Glenn A. Iba, et al. A Heuristic Approach to the Discovery of Macro-Operators, 1989, Machine Learning.
[94] Peter Dayan, et al. Technical Note: Q-Learning, 2004, Machine Learning.
[95] Prashant Shenoy, et al. Active QoS Flow Maintenance in Robotic, Mobile, Ad Hoc Networks, 2004.
[96] Michael Kearns, et al. Near-Optimal Reinforcement Learning in Polynomial Time, 1998, Machine Learning.
[97] Dana H. Ballard, et al. Learning to perceive and act by trial and error, 1991, Machine Learning.
[98] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[99] B. Nordstrom. Finite Markov Chains, 2005.