Transfer Learning for Reinforcement Learning Domains: A Survey
[1] E. Thorndike,et al. The influence of improvement in one mental function upon the efficiency of other functions. (I). , 1901 .
[2] B. Skinner,et al. Science and human behavior , 1953 .
[3] L. Shapley,et al. Stochastic Games, 1953, Proceedings of the National Academy of Sciences.
[4] R. Bellman. A Problem in the Sequential Design of Experiments , 1954 .
[5] J. McCarthy. A Tough Nut for Proof Procedures , 1964 .
[6] R. Bellman. Dynamic programming. , 1957, Science.
[7] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion) , 1977 .
[8] Robin Milner,et al. A Calculus of Communicating Systems , 1980, Lecture Notes in Computer Science.
[9] Richard S. Sutton,et al. Training and Tracking in Robotics , 1985, IJCAI.
[10] Allen Ginsberg,et al. Theory Revision via Prior Operationalization , 1988, AAAI.
[11] C. Watkins. Learning from delayed rewards , 1989 .
[12] Marco Colombetti,et al. Robot shaping: developing situated agents through learning , 1992 .
[13] Satinder Singh. Transfer of Learning by Composing Solutions of Elemental Sequential Tasks , 1992, Mach. Learn..
[14] C. Atkeson,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Real Time , 1993 .
[15] Agnar Aamodt,et al. Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches , 1994, AI Commun..
[16] Rich Caruana,et al. Learning Many Related Tasks at the Same Time with Backpropagation , 1994, NIPS.
[17] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[18] Minoru Asada,et al. Vision-Based Behavior Acquisition For A Shooting Robot By Using A Reinforcement Learning , 1994 .
[19] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[20] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems , 1994 .
[21] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[22] Sebastian Thrun,et al. Is Learning The n-th Thing Any Easier Than Learning The First? , 1995, NIPS.
[23] William W. Cohen. Fast Effective Rule Induction , 1995, ICML.
[24] Andrew G. Barto,et al. Improving Elevator Performance Using Reinforcement Learning , 1995, NIPS.
[25] Wei Zhang,et al. A Reinforcement Learning Approach to job-shop Scheduling , 1995, IJCAI.
[26] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[27] Richard S. Sutton,et al. Reinforcement Learning with Replacing Eligibility Traces , 2005, Machine Learning.
[28] Robert Givan,et al. Model Minimization in Markov Decision Processes , 1997, AAAI/IAAI.
[29] Christopher G. Atkeson,et al. A comparison of direct and model-based reinforcement learning , 1997, Proceedings of International Conference on Robotics and Automation.
[30] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[31] Manuela M. Veloso,et al. Bounding the Suboptimality of Reusing Subproblems , 1999, IJCAI.
[32] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[33] David Andre,et al. Model based Bayesian Exploration , 1999, UAI.
[34] M. Veloso,et al. Bounding the Suboptimality of Reusing Subproblems , 1999, IJCAI.
[35] Doina Precup,et al. Using Options for Knowledge Transfer in Reinforcement Learning , 1999 .
[36] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[37] Csaba Szepesvári,et al. An Evaluation Criterion for Macro-Learning and Some Results , 1999 .
[38] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[39] Manuela M. Veloso,et al. Multiagent Systems: A Survey from a Machine Learning Perspective , 2000, Auton. Robots.
[40] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[41] Jonathan Baxter,et al. A Model of Inductive Bias Learning , 2000, J. Artif. Intell. Res..
[42] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[43] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[44] David Andre,et al. State abstraction for programmable reinforcement learning agents , 2002, AAAI/IAAI.
[45] Chris Drummond,et al. Accelerating Reinforcement Learning by Composing Solutions of Automatically Identified Subtasks , 2011, J. Artif. Intell. Res..
[46] Balaraman Ravindran,et al. Model Minimization in Hierarchical Reinforcement Learning , 2002, SARA.
[47] Carlos Guestrin,et al. Generalizing plans to new environments in relational MDPs , 2003, IJCAI 2003.
[48] Masayuki Yamamura,et al. Multitask reinforcement learning on the distribution of MDPs , 2003, Proceedings 2003 IEEE International Symposium on Computational Intelligence in Robotics and Automation. Computational Intelligence in Robotics and Automation for the New Millennium (Cat. No.03EX694).
[49] Robert Givan,et al. Approximate Policy Iteration with a Policy Language Bias , 2003, NIPS.
[50] Fumihide Tanaka,et al. Multitask Reinforcement Learning on the Distribution of MDPs , 2003 .
[51] C. Boutilier,et al. Accelerating Reinforcement Learning through Implicit Imitation , 2003, J. Artif. Intell. Res..
[52] Balaraman Ravindran,et al. Relativized Options: Choosing the Right Transformation , 2003, ICML.
[53] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[54] Rich Caruana,et al. Multitask Learning , 1997, Machine Learning.
[55] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[56] Satinder Singh. Transfer of learning by composing solutions of elemental sequential tasks , 2004, Machine Learning.
[57] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[58] Allen Newell,et al. Chunking in Soar: The anatomy of a general learning mechanism , 1985, Machine Learning.
[59] Darrin C. Bentivegna,et al. Learning From Observation and Practice Using Primitives , 2004 .
[60] Peter Dayan,et al. Structure in the Space of Value Functions , 2002, Machine Learning.
[61] Michael G. Madden,et al. Transfer of Experience Between Reinforcement Learning Environments with Progressive Difficulty , 2004, Artificial Intelligence Review.
[62] A. Barto,et al. An algebraic approach to abstraction in reinforcement learning , 2004 .
[63] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 2002, Machine Learning.
[64] Andrew W. Moore,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[65] Peter Stone,et al. Reinforcement Learning for RoboCup Soccer Keepaway , 2005, Adapt. Behav..
[66] Pieter Abbeel,et al. Exploration and apprenticeship learning in reinforcement learning , 2005, ICML.
[67] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[68] Doina Precup,et al. Metrics for Markov Decision Processes with Infinite State Spaces , 2005, UAI.
[69] M. Veloso. Probabilistic Policy Reuse , 2005 .
[70] Gerhard Widmer,et al. Learning in the Presence of Concept Drift and Hidden Contexts , 1996, Machine Learning.
[71] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[72] Jude W. Shavlik,et al. Using Advice to Transfer Knowledge Acquired in One Reinforcement Learning Task to Another , 2005, ECML.
[73] J.L. Carroll,et al. Task similarity measures for transfer in reinforcement learning task libraries , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..
[74] David W. Aha,et al. Learning approximate preconditions for methods in hierarchical plans , 2005, ICML.
[75] Jude W. Shavlik,et al. Creating Advice-Taking Reinforcement Learners , 1998, Machine Learning.
[76] Jude W. Shavlik,et al. Giving Advice about Preferred Actions to Reinforcement Learners Via Knowledge-Based Kernel Regression , 2005, AAAI.
[77] Peter Stone,et al. Improving Action Selection in MDP's via Knowledge Transfer , 2005, AAAI.
[78] Samarth Swarup,et al. Cross-Domain Knowledge Transfer Using Structured Representations , 2006, AAAI.
[79] Doina Precup,et al. Methods for Computing State Similarity in Markov Decision Processes , 2006, UAI.
[80] Andrew G. Barto,et al. An intrinsic reward mechanism for efficient exploration , 2006, ICML.
[81] Andrew G. Barto,et al. Autonomous shaping: knowledge transfer in reinforcement learning , 2006, ICML.
[82] Liming Xiang,et al. Kernel-Based Reinforcement Learning , 2006, ICIC.
[83] Funlade T. Sunmola. Model Transfer for Markov Decision Tasks via Parameter Matching , 2006 .
[84] Doina Precup,et al. Knowledge Transfer in Markov Decision Processes , 2006 .
[85] Massimiliano Pontil,et al. Best of NIPS 2005: Highlights on the 'Inductive Transfer: 10 Years Later' Workshop , 2006 .
[86] Manuela M. Veloso,et al. Probabilistic policy reuse in a reinforcement learning agent , 2006, AAMAS '06.
[87] S. Mahadevan,et al. Proto-transfer Learning in Markov Decision Processes Using Spectral Methods , 2006 .
[88] Vishal Soni,et al. Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains , 2006, AAAI.
[89] Thomas J. Walsh,et al. Towards a Unified Theory of State Abstraction for MDPs , 2006, AI&M.
[90] Peter Stone,et al. Autonomous Learning of Stable Quadruped Locomotion , 2006, RoboCup.
[91] Jude W. Shavlik,et al. Skill Acquisition Via Transfer Learning and Advice Taking , 2006, ECML.
[92] Peter Stone,et al. Value-Function-Based Transfer for Reinforcement Learning Using Structure Mapping , 2006, AAAI.
[93] Thomas J. Walsh. Transferring State Abstractions Between MDPs , 2006 .
[94] Maurice Bruynooghe,et al. Learning Relational Options for Inductive Transfer in Relational Reinforcement Learning , 2007, ILP.
[95] Andrew G. Barto,et al. Building Portable Options: Skill Transfer in Reinforcement Learning , 2007, IJCAI.
[96] Bikramjit Banerjee,et al. General Game Learning Using Knowledge Transfer , 2007, IJCAI.
[97] Peter Stone,et al. Representation Transfer for Reinforcement Learning , 2007, AAAI Fall Symposium: Computational Approaches to Representation Change during Learning and Development.
[98] Manfred Huber,et al. Effective Control Knowledge Transfer through Learning Skill and Representation Hierarchies , 2007, IJCAI.
[99] Robert E. Schapire,et al. A Game-Theoretic Approach to Apprenticeship Learning , 2007, NIPS.
[100] Erik Talvitie,et al. An Experts Algorithm for Transfer Learning , 2007, IJCAI.
[101] Leslie Pack Kaelbling,et al. Efficient Bayesian Task-Level Transfer Learning , 2007, IJCAI.
[102] Jude W. Shavlik,et al. Relational Macros for Transfer in Reinforcement Learning , 2007, ILP.
[103] Peter Stone,et al. Model-Based Exploration in Continuous State Spaces , 2007, SARA.
[104] Kurt Driessens,et al. Transfer Learning in Reinforcement Learning Problems Through Partial Policy Recycling , 2007, ECML.
[105] Pat Langley,et al. Structural Transfer of Cognitive Skills , 2007 .
[106] Pieter Abbeel,et al. Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion , 2007, NIPS.
[107] Shai Ben-David,et al. A notion of task relatedness yielding provable multiple-task learning guarantees , 2008, Machine Learning.
[108] Shimon Whiteson,et al. Transfer via inter-task mappings in policy search reinforcement learning , 2007, AAMAS '07.
[109] Peter Stone,et al. Cross-domain transfer for reinforcement learning , 2007, ICML '07.
[110] Alan Fern,et al. Multi-task reinforcement learning: a hierarchical Bayesian approach , 2007, ICML '07.
[111] Peter Stone,et al. Transfer Learning via Inter-Task Mappings for Temporal Difference Learning , 2007, J. Mach. Learn. Res..
[112] Michael L. Littman,et al. Efficient Reinforcement Learning with Relocatable Action Models , 2007, AAAI.
[113] Ashwin Ram,et al. Transfer Learning in Real-Time Strategy Games Using Hybrid CBR/RL , 2007, IJCAI.
[114] Richard S. Sutton,et al. On the role of tracking in stationary environments , 2007, ICML '07.
[115] Peter Stone,et al. Graph-Based Domain Mapping for Transfer Learning in General Games , 2007, ECML.
[116] Sridhar Mahadevan,et al. Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes , 2007, J. Mach. Learn. Res..
[117] Raymond J. Mooney,et al. Transfer Learning by Mapping with Minimal Target Data , 2008 .
[118] Sriraam Natarajan,et al. Transfer in variable-reward hierarchical reinforcement learning , 2008, Machine Learning.
[119] W.D. Smart,et al. What does shaping mean for computational reinforcement learning? , 2008, 2008 7th IEEE International Conference on Development and Learning.
[120] Peter Stone,et al. Transferring Instances for Model-Based Reinforcement Learning , 2008, ECML/PKDD.
[121] P. Stone,et al. TAMER: Training an Agent Manually via Evaluative Reinforcement , 2008, 2008 7th IEEE International Conference on Development and Learning.
[122] P. Schrimpf,et al. Dynamic Programming , 2011 .
[123] Eduardo F. Morales,et al. An Introduction to Reinforcement Learning , 2011 .
[124] L. De Raedt,et al. Relational Reinforcement Learning , 2001, Encyclopedia of Machine Learning and Data Mining.