Learning via human feedback in continuous state and action spaces
[1] Wei Zhang,et al. A Reinforcement Learning Approach to Job-Shop Scheduling , 1995, IJCAI.
[2] Peter Stone,et al. Reinforcement Learning with Human Feedback in Mountain Car , 2011, AAAI Spring Symposium: Help Me Help You: Bridging the Gaps in Human-Agent Collaboration.
[3] Oliver Kroemer,et al. Learning Continuous Grasp Affordances by Sensorimotor Exploration , 2010, From Motor Learning to Interaction Learning in Robots.
[4] Peter Stone,et al. Training a Tetris agent via interactive shaping: a demonstration of the TAMER framework , 2010, AAMAS.
[5] Dimitri P. Bertsekas,et al. Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems , 1996, NIPS.
[6] Peter Stone,et al. Function Approximation via Tile Coding: Automating Parameter Choice , 2005, SARA.
[7] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[8] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[9] Peter Stone,et al. Interactively shaping agents via human reinforcement: the TAMER framework , 2009, K-CAP '09.
[10] Shalabh Bhatnagar,et al. Natural actor-critic algorithms , 2009, Autom..
[11] Jianghao Li,et al. Microassembly path planning using reinforcement learning for improving positioning accuracy of a 1 cm³ omni-directional mobile microrobot , 2011, Applied Intelligence.
[12] Peter Stone,et al. Combining manual feedback with subsequent MDP reward signals for reinforcement learning , 2010, AAMAS.
[13] Olivier Sigaud,et al. From Motor Learning to Interaction Learning in Robots , 2010, From Motor Learning to Interaction Learning in Robots.
[14] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[15] W. Bradley Knox,et al. Reinforcement Learning with Human and MDP Reward , 2012 .
[16] Farbod Fahimi,et al. Online human training of a myoelectric prosthesis controller via actor-critic reinforcement learning , 2011, 2011 IEEE International Conference on Rehabilitation Robotics.
[17] Vijay R. Konda,et al. On Actor-Critic Algorithms , 2003, SIAM J. Control. Optim..
[18] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[19] Peter Stone,et al. Reinforcement learning from simultaneous human and MDP reward , 2012, AAMAS.
[20] Ole-Christoffer Granmo,et al. Accelerated Bayesian learning for decentralized two-armed bandit based decision making with applications to the Goore Game , 2013, Applied Intelligence.
[21] Vittaldas V. Prabhu,et al. Distributed Reinforcement Learning Control for Batch Sequencing and Sizing in Just-In-Time Manufacturing Systems , 2004, Applied Intelligence.
[22] Gloria E. Phillips-Wren,et al. Innovations in agent collaboration, cooperation and Teaming, Part 2 , 2007, J. Netw. Comput. Appl..
[23] Oliver Kroemer,et al. Combining active learning and reactive control for robot grasping , 2010, Robotics Auton. Syst..
[24] Paloma Martínez,et al. Learning teaching strategies in an Adaptive and Intelligent Educational System through Reinforcement Learning , 2009, Applied Intelligence.
[25] Jan Peters,et al. Policy Search for Motor Primitives in Robotics , 2008, NIPS 2008.
[26] Nguyen Hoang Viet,et al. Policy Gradient SMDP for Resource Allocation and Routing in Integrated Services Networks , 2008, 2008 IEEE International Conference on Networking, Sensing and Control.
[27] Andrea Lockerd Thomaz,et al. Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance , 2006, AAAI.
[28] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[29] Richard S. Sutton,et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding , 1996 .
[30] K. Subramanian,et al. Learning Options through Human Interaction , 2011 .
[31] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[32] Bradley C. Love,et al. A New Experimental Perspective , 2012 .
[33] Andrew Tridgell,et al. Learning to Play Chess Using Temporal Differences , 2000, Machine Learning.
[34] Michael I. Jordan,et al. Learning Without State-Estimation in Partially Observable Markovian Decision Processes , 1994, ICML.
[35] Michael Wooldridge,et al. Agent-based software engineering , 1997, IEE Proc. Softw. Eng..
[36] Maziar Palhang,et al. Multi-criteria expertness based cooperative Q-learning , 2012, Applied Intelligence.
[37] TaeChoong Chung,et al. Hessian matrix distribution for Bayesian policy gradient reinforcement learning , 2011, Inf. Sci..
[38] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[39] Ian H. Witten,et al. An Adaptive Optimal Controller for Discrete-Time Markov Environments , 1977, Inf. Control..
[40] Manuel Lopes,et al. Fitted Natural Actor-Critic: A New Algorithm for Continuous State-Action MDPs , 2008, ECML/PKDD.
[41] G. Tesauro. Practical Issues in Temporal Difference Learning , 1992 .
[42] Ashwin Ram,et al. Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces , 1997, Adapt. Behav..
[43] W. B. Knox. Augmenting Reinforcement Learning with Human Feedback , 2011 .
[44] Matthew E. Taylor,et al. Integrating Human Demonstration and Reinforcement Learning : Initial Results in Human-Agent Transfer , 2010 .
[45] Thomas G. Dietterich,et al. Reinforcement Learning Via Practice and Critique Advice , 2010, AAAI.
[46] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[47] M. Gabriel,et al. Learning and Computational Neuroscience: Foundations of Adaptive Networks , 1990 .
[48] P. Stone,et al. TAMER: Training an Agent Manually via Evaluative Reinforcement , 2008, 2008 7th IEEE International Conference on Development and Learning.
[49] Betty J. Mohler,et al. Imitation and Reinforcement Learning for Motor Primitives with Perceptual Coupling , 2010, From Motor Learning to Interaction Learning in Robots.
[50] Peter Dayan,et al. Technical Note: Q-Learning , 2004, Machine Learning.