Explanation-Based Learning and Reinforcement Learning: A Unified View
[1] Edsger W. Dijkstra. A note on two problems in connexion with graphs, 1959, Numerische Mathematik.
[2] R. Bellman. Dynamic programming, 1957, Science.
[3] Jon Doyle. A Truth Maintenance System, 1979, Artif. Intell.
[4] H. Edelsbrunner. A new approach to rectangle intersections part I, 1983.
[5] J. Ross Quinlan. Learning Efficient Classification Procedures and Their Application to Chess End Games, 1983.
[6] Russell H. Taylor, et al. Automatic Synthesis of Fine-Motion Strategies for Robots, 1984.
[7] Michael R. Genesereth, et al. Logic programming, 1985, CACM.
[8] Ken Thompson. Retrograde Analysis of Certain Endgames, 1986, J. Int. Comput. Games Assoc.
[9] Michael A. Erdmann, et al. Using Backprojections for Fine Motion Planning with Uncertainty, 1986.
[10] Eric Horvitz, et al. Reasoning about beliefs and actions under computational resource constraints, 1987, Int. J. Approx. Reason.
[11] Jaime G. Carbonell, et al. Learning effective search control knowledge: an explanation-based approach, 1988.
[12] Christopher G. Atkeson, et al. Using Local Models to Control Movement, 1989, NIPS.
[13] C. Watkins. Learning from delayed rewards, 1989.
[14] D. Bertsekas, et al. Adaptive aggregation methods for infinite horizon dynamic programming, 1989.
[15] Marshall Bern. Hidden Surface Removal for Rectangles, 1990, J. Comput. Syst. Sci.
[16] Paul E. Utgoff, et al. Explaining Temporal Differences to Create Useful Concepts for Evaluating States, 1990, AAAI.
[17] Devika Subramanian, et al. The Utility of EBL in Recursive Domain Theories, 1990, AAAI.
[18] Claude Sammut, et al. Is Learning Rate a Good Performance Criterion for Learning?, 1990, ML.
[19] Steven Minton. Quantitative Results Concerning the Utility of Explanation-Based Learning, 1988, Artif. Intell.
[20] Ronald L. Rivest, et al. Introduction to Algorithms, 1990.
[21] Leslie Pack Kaelbling, et al. Input Generalization in Delayed Reinforcement Learning: An Algorithm and Performance Comparisons, 1991, IJCAI.
[22] Charles L. Forgy. Rete: a fast algorithm for the many pattern/many object pattern match problem, 1991.
[23] Stuart J. Russell, et al. Do the right thing: studies in limited rationality, 1991.
[24] Stuart J. Russell, et al. Principles of Metareasoning, 1989, Artif. Intell.
[25] Alan D. Christiansen. Learning to Predict in Uncertain Continuous Tasks, 1992, ML.
[26] N. Flann. Correct abstraction in counter-planning: a knowledge compilation approach, 1992.
[27] Gerald Tesauro. Practical Issues in Temporal Difference Learning, 1992, Mach. Learn.
[28] Raymond J. Mooney, et al. Combining FOIL and EBG to Speed-up Logic Programs, 1993, IJCAI.
[29] Andrew W. Moore, et al. The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces, 2004, Machine Learning.
[30] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1994.
[31] Thomas G. Dietterich, et al. Explanation-Based Learning and Reinforcement Learning: A Unified View, 1995, Machine-mediated learning.
[32] Wei Zhang, et al. A Reinforcement Learning Approach to Job-Shop Scheduling, 1995, IJCAI.
[33] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.
[34] Andrew W. Moore, et al. Reinforcement Learning: A Survey, 1996, J. Artif. Intell. Res.
[35] Yoram Singer, et al. A simple, fast, and effective rule learner, 1999, AAAI.
[36] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition, 1999, J. Artif. Intell. Res.
[37] Carlos Guestrin, et al. Generalizing plans to new environments in relational MDPs, 2003, IJCAI.
[38] Christopher J. C. H. Watkins, et al. Q-learning, 1992, Machine Learning.
[39] Tom M. Mitchell, et al. Explanation-Based Generalization: A Unifying View, 1986, Machine Learning.
[40] A. Newell, et al. Chunking in Soar: The anatomy of a general learning mechanism, 1985, Machine Learning.
[41] Long Ji Lin. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.
[42] Pat Langley, et al. An architecture for persistent reactive behavior, 2004, AAMAS.
[43] Andrew W. Moore, et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time, 1993, Machine Learning.
[44] Robert Givan, et al. Relational Reinforcement Learning: An Overview, 2004, ICML.
[45] Allen Newell, et al. The problem of expensive chunks and its solution by restricting expressiveness, 1993, Machine Learning.
[46] Richard S. Sutton. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[47] De, et al. Relational Reinforcement Learning, 2001, Encyclopedia of Machine Learning and Data Mining.