Simple Local Models for Complex Dynamical Systems
[1] R. E. Kalman. A New Approach to Linear Filtering and Prediction Problems, 1960.
[2] L. Baum, et al. A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains, 1970.
[3] Richard Fikes, et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving, 1971, IJCAI.
[4] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon, 1973, Oper. Res.
[5] Edward J. Sondik, et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs, 1978, Oper. Res.
[6] G. Monahan. State of the Art—A Survey of Partially Observable Markov Decision Processes: Theory, Models, and Algorithms, 1982.
[7] Ronald L. Rivest, et al. Diversity-based inference of finite automata, 1987, 28th Annual Symposium on Foundations of Computer Science (FOCS).
[8] Kim G. Larsen, et al. Bisimulation through Probabilistic Testing, 1991, Inf. Comput.
[9] Yolanda Gil, et al. Learning by Experimentation: Incremental Refinement of Incomplete Planning Domains, 1994, ICML.
[10] Xuemei Wang, et al. Learning by Observation and Practice: An Incremental Approach for Planning Operator Acquisition, 1995, ICML.
[11] Craig Boutilier, et al. Context-Specific Independence in Bayesian Networks, 1996, UAI.
[12] Andrew McCallum, et al. Reinforcement learning with selective perception and hidden state, 1996.
[13] Michael L. Littman, et al. Algorithms for Sequential Decision Making, 1996.
[14] Adam L. Berger, et al. A Maximum Entropy Approach to Natural Language Processing, 1996, Computational Linguistics.
[15] Jeffrey K. Uhlmann, et al. New extension of the Kalman filter to nonlinear systems, 1997, SPIE.
[16] Xavier Boyen, et al. Tractable Inference for Complex Stochastic Processes, 1998, UAI.
[17] A. Cassandra. Exact and approximate algorithms for partially observable Markov decision processes, 1998.
[18] Shlomo Zilberstein, et al. Finite-memory control of partially observable systems, 1998.
[19] Doina Precup, et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning, 1999, Artif. Intell.
[20] Craig Boutilier, et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage, 1999, J. Artif. Intell. Res.
[21] Geoffrey E. Hinton. Products of experts, 1999.
[22] Herbert Jaeger. Observable Operator Models for Discrete Stochastic Time Series, 2000, Neural Computation.
[23] Peter L. Bartlett, et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent, 2000, ICML.
[24] Richard S. Sutton, et al. Predictive Representations of State, 2001, NIPS.
[25] Ben Taskar, et al. Learning Probabilistic Models of Relational Structure, 2001, ICML.
[26] Lex Weaver, et al. The Optimal Reward Baseline for Gradient-Based Reinforcement Learning, 2001, UAI.
[27] Doina Precup, et al. Learning Options in Reinforcement Learning, 2002, SARA.
[28] Shie Mannor, et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning, 2002, ECML.
[29] Robert Givan, et al. Equivalence notions and model minimization in Markov decision processes, 2003, Artif. Intell.
[30] Shobha Venkataraman, et al. Efficient Solution Algorithms for Factored MDPs, 2003, J. Artif. Intell. Res.
[31] Craig Boutilier, et al. Bounded Finite State Controllers, 2003, NIPS.
[32] Michael R. James, et al. Learning and discovery of predictive state representations in dynamical systems with reset, 2004, ICML.
[33] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[34] Michael R. James, et al. Predictive State Representations: A New Theory for Modeling Dynamical Systems, 2004, UAI.
[35] Michael I. Jordan, et al. Factorial Hidden Markov Models, 1997, Machine Learning.
[36] A. Barto, et al. An algebraic approach to abstraction in reinforcement learning, 2004.
[37] Cosma Rohilla Shalizi, et al. Blind Construction of Optimal Nonlinear Recursive Predictors for Discrete Sequences, 2004, UAI.
[38] Matthew R. Rudary, et al. Predictive Linear-Gaussian Models of Stochastic Dynamical Systems, 2005, UAI.
[39] Eyal Amir, et al. Learning Partially Observable Deterministic Action Models, 2005, IJCAI.
[40] Michael R. James, et al. Learning predictive state representations in dynamical systems without reset, 2005, ICML.
[41] Charles Lee Isbell, et al. Looping suffix tree-based inference of partially observable hidden state, 2006, ICML.
[42] Olivier Sigaud, et al. Learning the structure of Factored Markov Decision Processes in reinforcement learning problems, 2006, ICML.
[43] Satinder P. Singh, et al. Predictive state representations with options, 2006, ICML.
[44] Michael H. Bowling, et al. Learning predictive state representations using non-blind policies, 2006, ICML.
[45] Alicia P. Wolfe, et al. Decision Tree Methods for Finding Reusable MDP Homomorphisms, 2006, AAAI.
[46] Thomas J. Walsh, et al. Towards a Unified Theory of State Abstraction for MDPs, 2006, AI&M.
[47] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.
[48] Michael L. Littman, et al. Efficient Structure Learning in Factored-State MDPs, 2007, AAAI.
[49] Vishal Soni, et al. Abstraction in Predictive State Representations, 2007, AAAI.
[50] Satinder P. Singh, et al. Exponential Family Predictive Representations of State, 2007, NIPS.
[51] Vishal Soni, et al. Relational Knowledge with Predictive State Representations, 2007, IJCAI.
[52] L. P. Kaelbling, et al. Learning Symbolic Models of Stochastic Domains, 2007, J. Artif. Intell. Res.
[53] Satinder P. Singh, et al. Efficiently learning linear-linear exponential family predictive representations of state, 2008, ICML.
[54] Doina Precup, et al. Bounding Performance Loss in Approximate MDP Homomorphisms, 2008, NIPS.
[55] Stefan Schaal, et al. Natural Actor-Critic, 2008, Neurocomputing.
[56] B. Kuipers, et al. From pixels to policies: A bootstrapping agent, 2008, 7th IEEE International Conference on Development and Learning.
[57] Erik Talvitie, et al. Building Incomplete but Accurate Models, 2008, ISAIM.
[58] Michael R. James, et al. Approximate predictive state representations, 2008, AAMAS.
[59] Britton D. Wolfe. Modeling Dynamical Systems with Structured Predictive State Representations, 2009.
[60] Satinder P. Singh, et al. Transfer via soft homomorphisms, 2009, AAMAS.
[61] Satinder Singh Baveja, et al. On predictive linear Gaussian models, 2009.
[62] Jonathan Mugan, et al. Autonomously Learning an Action Hierarchy Using a Learned Qualitative State Representation, 2009, IJCAI.