Explanation-Based Learning and Reinforcement Learning: A Unified View
