Learning and planning in structured worlds
[1] Mark A. Peot,et al. Postponing Threats in Partial-Order Planning , 1993, AAAI.
[2] Keiji Kanazawa,et al. A model for reasoning about persistence and causation , 1989 .
[3] Ronald A. Howard,et al. Dynamic Probabilistic Systems , 1971 .
[4] John L. Pollock,et al. The Logical Foundations of Goal-Regression Planning in Autonomous Agents , 1998, Artif. Intell..
[5] Wai Lam,et al. Using Causal Information and Local Measures to Learn Bayesian Networks , 1993, UAI.
[6] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.
[7] R. I. Bahar,et al. Algebraic decision diagrams and their applications , 1993, Proceedings of 1993 International Conference on Computer Aided Design (ICCAD).
[8] Leslie Pack Kaelbling,et al. Toward Approximate Planning in Very Large Stochastic Domains , 1994, AAAI.
[9] Stuart J. Russell,et al. Bayesian Q-Learning , 1998, AAAI/IAAI.
[10] Ronen I. Brafman,et al. Prioritized Goal Decomposition of Markov Decision Processes: Toward a Synthesis of Classical and Decision Theoretic Planning , 1997, IJCAI.
[11] P. W. Jones,et al. Bandit Problems: Sequential Allocation of Experiments , 1987 .
[12] Richard Fikes,et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.
[13] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[14] Mark D. Johnston,et al. Scheduling with neural networks - the case of the Hubble Space Telescope , 1992, Comput. Oper. Res..
[15] Gregory F. Cooper,et al. A Bayesian Method for the Induction of Probabilistic Networks from Data , 1992 .
[16] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[17] Ronald L. Rivest,et al. Learning decision lists , 1987, Machine Learning.
[18] Ross D. Shachter. Evaluating Influence Diagrams , 1986, Oper. Res..
[19] Craig Boutilier,et al. Context-Specific Independence in Bayesian Networks , 1996, UAI.
[20] Andrew G. Barto,et al. On the Computational Economics of Reinforcement Learning , 1991 .
[21] John H. Holland,et al. Escaping brittleness: the possibilities of general-purpose learning algorithms applied to parallel rule-based systems , 1995 .
[22] Stuart J. Russell,et al. Do the right thing - studies in limited rationality , 1991 .
[23] Craig Boutilier,et al. Exploiting Structure in Policy Construction , 1995, IJCAI.
[24] Moisés Goldszmidt,et al. Action Networks: A Framework for Reasoning about Actions and Change under Uncertainty , 1994, UAI.
[25] Kee-Eung Kim,et al. Solving Very Large Weakly Coupled Markov Decision Processes , 1998, AAAI/IAAI.
[26] Satinder Singh. Transfer of Learning by Composing Solutions of Elemental Sequential Tasks , 1992, Mach. Learn..
[27] Richard S. Sutton,et al. Dyna, an integrated architecture for learning, planning, and reacting , 1990, SIGART Bull..
[28] R. M. Oliver,et al. Influence diagrams, belief nets and decision analysis , 1992 .
[29] Ronald A. Howard. Dynamic Programming and Markov Processes , 1960 .
[30] David Heckerman,et al. A Tutorial on Learning with Bayesian Networks , 1998, Learning in Graphical Models.
[31] Marcel Schoppers,et al. Universal Plans for Reactive Robots in Unpredictable Environments , 1987, IJCAI.
[32] Kristian G. Olesen,et al. HUGIN - A Shell for Building Bayesian Belief Universes for Expert Systems , 1989, IJCAI.
[33] David A. McAllester,et al. Systematic Nonlinear Planning , 1991, AAAI.
[34] Clausin D. Hadley,et al. Some Theory of Sampling , 1950 .
[35] David Heckerman,et al. Probabilistic similarity networks , 1991, Networks.
[36] Leslie Pack Kaelbling,et al. Planning under Time Constraints in Stochastic Domains , 1993, Artif. Intell..
[37] Craig Boutilier,et al. Correlated Action Effects in Decision Theoretic Regression , 1997, UAI.
[38] David Andre,et al. Generalized Prioritized Sweeping , 1997, NIPS.
[39] Brian Drabble. Mission scheduling for spacecraft: Diaries of T-SCHED , 1990, Expert Planning Systems.
[40] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[41] Craig Boutilier,et al. Abstraction and Approximate Decision-Theoretic Planning , 1997, Artif. Intell..
[42] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ICML.
[43] Peter Haddawy,et al. Decision-theoretic Refinement Planning Using Inheritance Abstraction , 1994, AIPS.
[44] Ronald A. Howard,et al. Information Value Theory , 1966, IEEE Trans. Syst. Sci. Cybern..
[45] Robert Givan,et al. Bounded Parameter Markov Decision Processes , 1997, ECP.
[46] Richard William Dearden,et al. Abstraction and search for decision-theoretic planning , 1994 .
[47] Nir Friedman,et al. Learning Bayesian Networks with Local Structure , 1996, UAI.
[48] Thomas G. Dietterich,et al. Explanation-Based Learning and Reinforcement Learning: A Unified View , 1995, Machine-mediated learning.
[49] Yoram Singer,et al. Efficient Bayesian Parameter Estimation in Large Discrete Domains , 1998, NIPS.
[50] Ibrahim N. Hajj,et al. Parallel circuit simulation on supercomputers , 1989 .
[51] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[52] Ronald A. Howard,et al. Readings on the Principles and Applications of Decision Analysis , 1989 .
[53] Thomas G. Dietterich. The MAXQ Method for Hierarchical Reinforcement Learning , 1998, ICML.
[54] Craig Boutilier,et al. Decision-Theoretic Planning: Structural Assumptions and Computational Leverage , 1999, J. Artif. Intell. Res..
[55] Andrew W. Moore,et al. Efficient memory-based learning for robot control , 1990 .
[56] Ronald L. Rivest,et al. Constructing Optimal Binary Decision Trees is NP-Complete , 1976, Inf. Process. Lett..
[57] Mark Stefik,et al. Planning and Meta-Planning (MOLGEN: Part 2) , 1981, Artif. Intell..
[58] M. Puterman,et al. Modified Policy Iteration Algorithms for Discounted Markov Decision Problems , 1978 .
[59] Craig Boutilier,et al. Integrating Planning and Execution in Stochastic Domains , 1994, UAI.
[60] Peter Haddawy,et al. Toward Case-Based Preference Elicitation: Similarity Measures on Preference Structures , 1998, UAI.
[61] Ronald L. Rivest,et al. Inferring Decision Trees Using the Minimum Description Length Principle , 1989, Inf. Comput..
[62] Daniel S. Weld,et al. A Probabilistic Model of Action for Least-Commitment Planning with Information Gathering , 1994, UAI.
[63] Claude-Nicolas Fiechter,et al. Design and analysis of efficient reinforcement learning algorithms , 1997 .
[64] J. Ross Quinlan,et al. C4.5: Programs for Machine Learning , 1992 .
[65] Leslie Pack Kaelbling,et al. Planning With Deadlines in Stochastic Domains , 1993, AAAI.
[66] Craig Boutilier,et al. Stochastic dynamic programming with factored representations , 2000, Artif. Intell..
[67] Ronen I. Brafman,et al. Structured Reachability Analysis for Markov Decision Processes , 1998, UAI.
[68] Jesse Hoey,et al. SPUDD: Stochastic Planning using Decision Diagrams , 1999, UAI.
[69] Thomas M. Cover,et al. Elements of Information Theory , 2005 .
[70] Peter Haddawy,et al. Abstracting Probabilistic Actions , 1994, UAI.
[71] F. Fairman. Introduction to dynamic systems: Theory, models and applications , 1979, Proceedings of the IEEE.
[72] Craig Boutilier,et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.
[73] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State , 1995, ICML.
[74] Leslie Pack Kaelbling,et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.
[75] Robert Givan,et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes , 1997, UAI.
[76] Anton Schwartz,et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards , 1993, ICML.
[77] Andrew W. Moore,et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces , 1995, Machine Learning.
[78] Satinder P. Singh,et al. How to Dynamically Merge Markov Decision Processes , 1997, NIPS.
[79] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1996 .
[80] Long Ji Lin,et al. Programming Robots Using Reinforcement Learning and Teaching , 1991, AAAI.
[81] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[82] Nicholas Kushmerick,et al. An Algorithm for Probabilistic Least-Commitment Planning , 1994, AAAI.
[83] Heekuck Oh,et al. Neural Networks for Pattern Recognition , 1993, Adv. Comput..
[84] Brian D. Ripley,et al. Stochastic Simulation , 2005 .
[85] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[86] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[87] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..