Module-Based Reinforcement Learning: Experiments with a Real Robot
András Lörincz | Csaba Szepesvári | Zsolt Kalmár
[1] G. Pólya,et al. How to Solve It , 1945 .
[2] R. Bellman. Dynamic programming. , 1957, Science.
[3] Allen Newell,et al. Human Problem Solving. , 1973 .
[4] Earl D. Sacerdoti. Planning in a hierarchy of abstraction spaces , 1973, IJCAI 1973.
[5] J. Zabczyk. Optimal control by means of switchings , 1973 .
[6] M. I. Henig. Vector-Valued Dynamic Programming , 1983 .
[7] R. Korf. Learning to solve problems by searching for macro-operators , 1983 .
[8] Richard S. Sutton,et al. Temporal credit assignment in reinforcement learning , 1984 .
[9] Richard E. Korf,et al. Macro-Operators: A Weak Method for Learning , 1985, Artif. Intell..
[10] Patchigolla Kiran Kumar,et al. A Survey of Some Results in Stochastic Adaptive Control , 1985 .
[11] K. Nakazono. A Qualitative Physics Based on Confluences , 1986 .
[12] Richard E. Korf,et al. Planning as Search: A Quantitative Approach , 1987, Artif. Intell..
[13] Rodney A. Brooks,et al. Learning to Coordinate Behaviors , 1990, AAAI.
[14] Rodney A. Brooks,et al. Elephants don't play chess , 1990, Robotics Auton. Syst..
[15] Pattie Maes,et al. A bottom-up mechanism for behavior selection in an artificial creature , 1991 .
[16] John R. Koza,et al. Automatic Programming of Robots Using Genetic Programming , 1992, AAAI.
[17] Rodney A. Brooks,et al. Artificial Life and Real Robots , 1992 .
[18] Sebastian Thrun,et al. The role of exploration in learning control , 1992 .
[19] Sridhar Mahadevan,et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell..
[20] Lonnie Chrisman,et al. Reinforcement Learning with Perceptual Aliasing: The Perceptual Distinctions Approach , 1992, AAAI.
[21] Satinder P. Singh,et al. Reinforcement Learning with a Hierarchy of Abstract Models , 1992, AAAI.
[22] Sven Koenig,et al. Complexity Analysis of Real-Time Reinforcement Learning , 1992, AAAI.
[23] Roger W. Brockett,et al. Hybrid Models for Motion Control Systems , 1993 .
[24] András Lörincz,et al. Behavior of an Adaptive Self-organizing Autonomous Agent Working with Cues and Competing Concepts , 1993, Adapt. Behav..
[25] Toby Tyrrell,et al. Computational mechanisms for action selection , 1993 .
[26] Reid G. Simmons,et al. Complexity Analysis of Real-Time Reinforcement Learning , 1993, AAAI.
[27] Robert L. Grossman,et al. Timed Automata , 1999, CAV.
[28] Andrew McCallum,et al. Overcoming Incomplete Perception with Utile Distinction Memory , 1993, ICML.
[29] C. Szepesvári. Dynamic concept model learns optimal policies , 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).
[30] V. Borkar,et al. A unified framework for hybrid control: background, model, and theory , 1994 .
[31] Michael I. Jordan,et al. Massachusetts Institute of Technology, Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences , 1996 .
[32] Z. Kalmar,et al. Generalization in an autonomous agent , 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).
[33] Marco Colombetti,et al. Robot Shaping: Developing Autonomous Agents Through Learning , 1994, Artif. Intell..
[34] John Lygeros,et al. Hierarchical Hybrid Control: A Case Study , 1994, Hybrid Systems.
[35] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[36] John N. Tsitsiklis,et al. Asynchronous stochastic approximation and Q-learning , 1994, Mach. Learn..
[37] Sebastian Thrun,et al. Finding Structure in Reinforcement Learning , 1994, NIPS.
[38] M. Dorigo. ALECSYS and the AutonoMouse: Learning to Control a Real Robot by Distributed Classifier Systems , 1995, Machine Learning.
[39] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1995, NIPS.
[40] Andrew G. Barto,et al. Learning to Act Using Real-Time Dynamic Programming , 1995, Artif. Intell..
[41] Michael S. Branicky,et al. Studies in hybrid systems: modeling, analysis, and control , 1996 .
[42] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[43] Csaba Szepesvári,et al. A Generalized Reinforcement-Learning Model: Convergence and Applications , 1996, ICML.
[44] Selahattin Kuru,et al. Qualitative System Identification: Deriving Structure from Behavior , 1996, Artif. Intell..
[45] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[46] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[47] Matthias Heger. The Loss from Imperfect Value Functions in Expectation-Based and Minimax-Based Tasks , 1996, Machine Learning.
[48] John N. Tsitsiklis,et al. Analysis of Temporal-Difference Learning with Function Approximation , 1996, NIPS.
[49] Minoru Asada,et al. Behavior coordination for a mobile robot using modular reinforcement learning , 1996, Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. IROS '96.
[50] Marco Colombetti,et al. Behavior analysis and training-a methodology for behavior engineering , 1996, IEEE Trans. Syst. Man Cybern. Part B.
[51] Csaba Szepesvári. Generalized Markov Decision Processes: Dynamic-programming and Reinforcement-learning Algorithms , 1996 .
[52] Rémi Munos. Finite-Element Methods with Local Triangulation Refinement for Continuous Reinforcement Learning Problems , 1997, ECML.
[53] Csaba Szepesvári,et al. Learning and Exploitation Do Not Conflict Under Minimax Optimality , 1997, ECML.
[54] Satinder P. Singh,et al. How to Dynamically Merge Markov Decision Processes , 1997, NIPS.
[55] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[56] Jürgen Schmidhuber,et al. HQ-Learning , 1997, Adapt. Behav..
[57] John Lygeros,et al. A Design Framework For Hierarchical, Hybrid Control , 1997 .
[58] Ronen I. Brafman,et al. Modeling Agents as Qualitative Decision Makers , 1997, Artif. Intell..
[59] Csaba Szepesvari,et al. Module Based Reinforcement Learning for a Real Robot , 1997 .
[60] Maja J. Mataric,et al. Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.
[61] Csaba Szepesvári,et al. Multi-criteria Reinforcement Learning , 1998, ICML.
[62] Csaba Szepesvári. Static and Dynamic Aspects of Optimal Sequential Decision Making , 1998 .
[63] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[64] Minoru Asada,et al. Purposive behavior acquisition for a real robot by vision-based reinforcement learning , 1995, Machine Learning.
[65] M. Heger. The loss from imperfect value functions in expectation-based and minimax-based tasks , 2004, Machine Learning.
[66] András Lörincz,et al. Genetic algorithm with alphabet optimization , 1995, Biological Cybernetics.
[67] John N. Tsitsiklis,et al. Feature-based methods for large scale dynamic programming , 2004, Machine Learning.
[68] Sean R Eddy,et al. What is dynamic programming? , 2004, Nature Biotechnology.
[69] C. W. Tate. Solve it. , 2005, Nursing standard (Royal College of Nursing (Great Britain) : 1987).
[70] Rémi Munos. Finite-Element Methods with Local Triangulation Refinement for Continuous Reinforcement Learning Problems , 2005 .
[71] Minoru Asada,et al. Purposive Behavior Acquisition for a Real Robot by Vision-Based Reinforcement Learning , 2005, Machine Learning.
[72] Thomas G. Dietterich,et al. Purposive Behavior Acquisition for a Real Robot by Vision-Based Reinforcement Learning. A Multi-Agent Architecture Integrating Learning and Fuzzy Techniques for Landmark-Based , .