Explanation-Based Learning and Reinforcement Learning: A Unified View