[1] George B. Dantzig,et al. Decomposition Principle for Linear Programs , 1960 .
[2] Ronald A. Howard,et al. Dynamic Programming and Markov Processes , 1960 .
[3] Stephen J. Garland,et al. Algorithm 97: Shortest path , 1962, Commun. ACM.
[4] F. d'Epenoux,et al. A Probabilistic Production and Inventory Problem , 1963 .
[5] Karl Johan Åström,et al. Optimal control of Markov processes with incomplete state information , 1965 .
[6] Rutherford Aris,et al. Discrete Dynamic Programming , 1965, The Mathematical Gazette.
[7] P. Graefe. Linear stochastic systems , 1966 .
[8] Bennett L. Fox,et al. Scientific Applications: An algorithm for identifying the ergodic subchains and transient states of a stochastic matrix , 1967, Commun. ACM.
[9] D. J. White,et al. Decision Theory , 2018, Behavioral Finance for Private Banking.
[10] E. J. Sondik,et al. The Optimal Control of Partially Observable Markov Decision Processes. , 1971 .
[11] Richard Fikes,et al. STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving , 1971, IJCAI.
[12] Richard Fikes,et al. Learning and Executing Generalized Robot Plans , 1993, Artif. Intell..
[13] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[14] Earl D. Sacerdoti,et al. Planning in a Hierarchy of Abstraction Spaces , 1974, IJCAI.
[15] E. Polak. Introduction to linear and nonlinear programming , 1973 .
[16] H. Kushner,et al. Decomposition of systems governed by Markov chains , 1974 .
[17] Earl D. Sacerdoti,et al. The Nonlinear Nature of Plans , 1975, IJCAI.
[18] David H. D. Warren,et al. Generating Conditional Plans and Programs , 1976, AISB.
[19] John B. Kidd,et al. Decisions with Multiple Objectives—Preferences and Value Tradeoffs , 1977 .
[20] M. Puterman,et al. Modified Policy Iteration Algorithms for Discounted Markov Decision Problems , 1978 .
[21] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs , 1978, Oper. Res..
[22] R. L. Keeney,et al. Decisions with Multiple Objectives: Preferences and Value Trade-Offs , 1977, IEEE Transactions on Systems, Man, and Cybernetics.
[23] Introduction to dynamic systems: Theory, models and applications , 1980, Proceedings of the IEEE.
[24] F. Fairman. Introduction to dynamic systems: Theory, models and applications , 1979, Proceedings of the IEEE.
[25] John McCarthy,et al. Some Philosophical Problems from the Standpoint of Artificial Intelligence , 1987 .
[26] Jan Telgen,et al. Stochastic Dynamic Programming , 1982 .
[27] Richard E. Korf,et al. Macro-Operators: A Weak Method for Learning , 1985, Artif. Intell..
[28] Paul J. Schweitzer,et al. Iterative Aggregation-Disaggregation Procedures for Discounted Semi-Markov Reward Processes , 1985, Oper. Res..
[30] Randal E. Bryant,et al. Graph-Based Algorithms for Boolean Function Manipulation , 1986, IEEE Transactions on Computers.
[31] Ross D. Shachter. Evaluating Influence Diagrams , 1986, Oper. Res..
[32] Some philosophical problems from the standpoint of AI , 1987 .
[33] David Chapman,et al. Planning for Conjunctive Goals , 1987, Artif. Intell..
[34] J. Finger,et al. Exploiting constraints in design synthesis , 1987 .
[35] Marcel Schoppers,et al. Universal Plans for Reactive Robots in Unpredictable Environments , 1987, IJCAI.
[36] John N. Tsitsiklis,et al. The Complexity of Markov Decision Processes , 1987, Math. Oper. Res..
[37] Chelsea C. White,et al. Solution Procedures for Partially Observed Markov Decision Processes , 1989, Oper. Res..
[38] Edwin P. D. Pednault,et al. ADL: Exploring the Middle Ground Between STRIPS and the Situation Calculus , 1989, KR.
[39] Judea Pearl,et al. Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.
[40] D. Bertsekas,et al. Adaptive aggregation methods for infinite horizon dynamic programming , 1989 .
[41] Keiji Kanazawa,et al. A model for reasoning about persistence and causation , 1989 .
[42] Richard E. Korf,et al. Real-Time Heuristic Search , 1990, Artif. Intell..
[43] Ross D. Shachter,et al. Dynamic programming and influence diagrams , 1990, IEEE Trans. Syst. Man Cybern..
[44] Drew V. McDermott,et al. Projecting plans for uncertain worlds , 1990 .
[45] Franz Josef Radermacher,et al. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (Judea Pearl) , 1990, SIAM Rev..
[46] David A. McAllester,et al. Systematic Nonlinear Planning , 1991, AAAI.
[47] Keith W. Ross,et al. Multichain Markov Decision Processes with a Sample Path Constraint: A Decomposition Approach , 1991, Math. Oper. Res..
[48] Leslie Pack Kaelbling,et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons , 1991, IJCAI.
[49] Andrew B. Baker,et al. Nonmonotonic Reasoning in the Framework of Situation Calculus , 1991, Artif. Intell..
[50] David Heckerman,et al. Advances in Probabilistic Reasoning , 1994, Conference on Uncertainty in Artificial Intelligence.
[51] William S. Lovejoy,et al. Computationally Feasible Bounds for Partially Observed Markov Decision Processes , 1991, Oper. Res..
[52] Qiang Yang,et al. Characterizing Abstraction Hierarchies for Planning , 1991, AAAI.
[53] W. Lovejoy. A survey of algorithmic methods for partially observed Markov decision processes , 1991 .
[54] Michael P. Wellman,et al. Planning and Control , 1991 .
[55] David Lee,et al. Online minimization of transition systems (extended abstract) , 1992, STOC '92.
[56] Sven Koenig,et al. Optimal Probabilistic and Decision-Theoretic Planning using Markovian Decision Theory , 1992 .
[57] Mark A. Peot,et al. Conditional nonlinear planning , 1992 .
[58] Daniel S. Weld,et al. UCPOP: A Sound, Complete, Partial Order Planner for ADL , 1992, KR.
[59] Oren Etzioni,et al. An Approach to Planning with Incomplete Information , 1992, KR.
[60] Uffe Kjærulff,et al. A Computational Scheme for Reasoning in Dynamic Probabilistic Networks , 1992, UAI.
[61] Solomon Eyal Shimony,et al. The role of relevance in explanation I: Irrelevance as statistical independence , 1993, Int. J. Approx. Reason..
[62] Leslie Pack Kaelbling,et al. Planning With Deadlines in Stochastic Domains , 1993, AAAI.
[63] Mark A. Peot,et al. Postponing Threats in Partial-Order Planning , 1993, AAAI.
[64] Craig A. Knoblock. Generating abstraction hierarchies - an automated approach to reducing search in planning , 1993, The Kluwer international series in engineering and computer science.
[65] Enrico Macii,et al. Algebraic decision diagrams and their applications , 1993, Proceedings of 1993 International Conference on Computer Aided Design (ICCAD).
[66] Anton Schwartz,et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards , 1993, ICML.
[67] Craig Boutilier,et al. Using Abstractions for Decision-Theoretic Planning with Time Constraints , 1994, AAAI.
[68] Stuart J. Russell,et al. Control Strategies for a Stochastic Planner , 1994, AAAI.
[69] Jaime G. Carbonell,et al. Control Knowledge to Improve Plan Quality , 1994, AIPS.
[70] Daniel S. Weld,et al. A Probabilistic Model of Action for Least-Commitment Planning with Information Gathering , 1994, UAI.
[71] Steve Hanks,et al. Optimal Planning with a Goal-directed Utility Model , 1994, AIPS.
[72] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[73] Leslie Pack Kaelbling,et al. Acting Optimally in Partially Observable Stochastic Domains , 1994, AAAI.
[74] Sridhar Mahadevan,et al. To Discount or Not to Discount in Reinforcement Learning: A Case Study Comparing R Learning and Q Learning , 1994, ICML.
[75] Peter Haddawy,et al. Decision-theoretic Refinement Planning Using Inheritance Abstraction , 1994, AIPS.
[76] Daniel S. Weld,et al. Probabilistic Planning with Information Gathering and Contingent Execution , 1994, AIPS.
[77] James A. Hendler,et al. Readings in Planning , 1994 .
[78] Robert P. Goldman,et al. Representing Uncertainty in Simple Planners , 1994, KR.
[79] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[80] Daniel S. Weld. An Introduction to Least Commitment Planning , 1994, AI Mag..
[81] Tom Bylander,et al. The Computational Complexity of Propositional STRIPS Planning , 1994, Artif. Intell..
[82] Peter Haddawy,et al. Abstracting Probabilistic Actions , 1994, UAI.
[83] Fangzhen Lin,et al. State Constraints Revisited , 1994, J. Log. Comput..
[84] Sebastian Thrun,et al. Finding Structure in Reinforcement Learning , 1994, NIPS.
[85] Craig Boutilier,et al. Integrating Planning and Execution in Stochastic Domains , 1994, UAI.
[86] Michael I. Jordan,et al. Reinforcement Learning with Soft State Aggregation , 1994, NIPS.
[87] Drew McDermott,et al. Modeling a Dynamic and Uncertain World I: Symbolic and Probabilistic Reasoning About Change , 1994, Artif. Intell..
[88] Leslie Pack Kaelbling,et al. Planning under Time Constraints in Stochastic Domains , 1993, Artif. Intell..
[89] Nicholas Kushmerick,et al. An Algorithm for Probabilistic Planning , 1995, Artif. Intell..
[90] Thomas G. Dietterich,et al. Explanation-Based Learning and Reinforcement Learning: A Unified View , 1995, Machine-mediated learning.
[91] Craig Boutilier,et al. Process-Oriented Planning and Average-Reward Optimality , 1995, IJCAI.
[92] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[93] Richard S. Sutton,et al. TD Models: Modeling the World at a Mixture of Time Scales , 1995, ICML.
[94] David Poole,et al. Exploiting the Rule Structure for Decision Making within the Independent Choice Logic , 1995, UAI.
[95] Avrim Blum,et al. Fast Planning Through Planning Graph Analysis , 1995, IJCAI.
[96] Thomas Dean,et al. Decomposition Techniques for Planning in Stochastic Domains , 1995, IJCAI.
[97] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning , 1995 .
[98] Reid G. Simmons,et al. Probabilistic Robot Navigation in Partially Observable Environments , 1995, IJCAI.
[99] Stuart J. Russell,et al. Approximating Optimal Policies for Partially Observable Stochastic Domains , 1995, IJCAI.
[100] Leslie Pack Kaelbling,et al. On the Complexity of Solving Markov Decision Problems , 1995, UAI.
[101] Reid G. Simmons,et al. Real-Time Search in Non-Deterministic Domains , 1995, IJCAI.
[102] Craig Boutilier,et al. Exploiting Structure in Policy Construction , 1995, IJCAI.
[103] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[104] Andrew McCallum,et al. Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State , 1995, ICML.
[105] Yiannis Aloimonos,et al. Artificial intelligence - theory and practice , 1995 .
[106] Nevin Lianwen Zhang,et al. Exploiting Causal Independence in Bayesian Network Inference , 1996, J. Artif. Intell. Res..
[107] Craig Boutilier,et al. Context-Specific Independence in Bayesian Networks , 1996, UAI.
[108] Craig Boutilier,et al. The Frame Problem and Bayesian Network Action Representation , 1996, Canadian Conference on AI.
[109] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[110] 우종우,et al. [Overseas Research Institute Profile] "The Institute for the Learning Sciences" , 1996 .
[111] Craig Boutilier,et al. Computing Optimal Policies for Partially Observable Decision Processes Using Compact Representations , 1996, AAAI/IAAI, Vol. 2.
[112] Fahiem Bacchus,et al. Using temporal logic to control search in a forward chaining planner , 1996 .
[113] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[114] S. Hanks,et al. A value-directed approach to planning , 1996 .
[115] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[116] T. Dean,et al. Generating optimal policies for high-level plans with conditional branches and loops , 1996 .
[117] Craig Boutilier,et al. Approximate Value Trees in Structured Dynamic Programming , 1996, ICML.
[118] Craig Boutilier,et al. Rewarding Behaviors , 1996, AAAI/IAAI, Vol. 2.
[119] Rina Dechter,et al. Bucket elimination: A unifying framework for probabilistic inference , 1996, UAI.
[120] Robert Givan,et al. Model Minimization, Regression, and Propositional STRIPS Planning , 1997, IJCAI.
[121] Craig Boutilier,et al. Correlated Action Effects in Decision Theoretic Regression , 1997, UAI.
[122] Rina Dechter,et al. Mini-Buckets: A General Scheme for Generating Approximations in Automated Reasoning , 1997, IJCAI.
[123] Ronen I. Brafman,et al. Prioritized Goal Decomposition of Markov Decision Processes: Toward a Synthesis of Classical and Decision Theoretic Planning , 1997, IJCAI.
[124] Wenju Liu,et al. A Model Approximation Scheme for Planning in Partially Observable Stochastic Domains , 1997, J. Artif. Intell. Res..
[125] David Poole,et al. The Independent Choice Logic for Modelling Multiple Agents Under Uncertainty , 1997, Artif. Intell..
[126] Satinder P. Singh,et al. How to Dynamically Merge Markov Decision Processes , 1997, NIPS.
[127] Qiang Yang,et al. Intelligent planning - a decomposition and abstraction based approach , 1997, Artificial intelligence.
[128] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[129] Subbarao Kambhampati,et al. Refinement Planning as a Unifying Framework for Plan Synthesis , 1997, AI Mag..
[130] Milos Hauskrecht,et al. Planning and control in stochastic domains with imperfect information , 1997 .
[131] Craig Boutilier,et al. Abstraction and Approximate Decision-Theoretic Planning , 1997, Artif. Intell..
[132] Robert Givan,et al. Model Minimization in Markov Decision Processes , 1997, AAAI/IAAI.
[133] Blai Bonet,et al. A Robust and Fast Action Selection Mechanism for Planning , 1997, AAAI/IAAI.
[134] David Poole,et al. Probabilistic Partial Evaluation: Exploiting Rule Structure in Probabilistic Inference , 1997, IJCAI.
[135] Michael L. Littman,et al. Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes , 1997, UAI.
[136] Shieu-Hong Lin,et al. Exploiting structure for planning and control , 1997 .
[137] Craig Boutilier,et al. Structured Solution Methods for Non-Markovian Decision Processes , 1997, AAAI/IAAI.
[138] Robert Givan,et al. Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes , 1997, UAI.
[139] Michael L. Littman,et al. Probabilistic Propositional Planning: Representations and Complexity , 1997, AAAI/IAAI.
[140] Ronen I. Brafman,et al. A Heuristic Variable Grid Solution Method for POMDPs , 1997, AAAI/IAAI.
[141] Milos Hauskrecht,et al. Hierarchical Solution of Markov Decision Processes using Macro-actions , 1998, UAI.
[142] Doina Precup,et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Options , 1998, ECML.
[143] Wolfram Burgard,et al. A Probabilistic Approach to Concurrent Mapping and Localization for Mobile Robots , 1998, Auton. Robots.
[144] Shlomo Zilberstein,et al. Heuristic Search in Cyclic AND/OR Graphs , 1998, AAAI/IAAI.
[145] R. Sutton,et al. Theoretical Results on Reinforcement Learning with Temporally Abstract Behaviors , 1998 .
[146] David Poole,et al. Context-specific approximation in probabilistic inference , 1998, UAI.
[147] Kee-Eung Kim,et al. Solving Stochastic Planning Problems with Large State and Action Spaces , 1998, AIPS.
[148] Kee-Eung Kim,et al. Solving Very Large Weakly Coupled Markov Decision Processes , 1998, AAAI/IAAI.
[149] Blai Bonet,et al. Learning Sorting and Decision Trees with POMDPs , 1998, ICML.
[150] Yee Whye Teh,et al. Making Forward Chaining Relevant , 1998, AIPS.
[151] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[152] Michael L. Littman,et al. The Computational Complexity of Probabilistic Planning , 1998, J. Artif. Intell. Res..
[153] Peter Haddawy,et al. Utility Models for Goal‐Directed, Decision‐Theoretic Planners , 1998, Comput. Intell..
[154] Ronen I. Brafman,et al. Structured Reachability Analysis for Markov Decision Processes , 1998, UAI.
[155] Ronald Parr,et al. Flexible Decomposition Algorithms for Weakly Coupled Markov Decision Problems , 1998, UAI.
[156] Anne Condon,et al. On the Undecidability of Probabilistic Planning and Infinite-Horizon Partially Observable Markov Decision Problems , 1999, AAAI/IAAI.
[157] Jim Blythe,et al. Decision-Theoretic Planning , 1999, AI Mag..
[158] Manuela M. Veloso,et al. Team-partitioned, opaque-transition reinforcement learning , 1999, AGENTS '99.
[159] Jesse Hoey,et al. SPUDD: Stochastic Planning using Decision Diagrams , 1999, UAI.
[160] Robert Givan,et al. Bounded-parameter Markov decision processes , 2000, Artif. Intell..
[161] Craig Boutilier,et al. Stochastic dynamic programming with factored representations , 2000, Artif. Intell..
[162] Hiroaki Kitano,et al. RoboCup-98: Robot Soccer World Cup II , 2001, Lecture Notes in Computer Science.