Reinforcement learning in robotics: A survey
[1] R. E. Kalman, et al. When Is a Linear Control System Optimal?, 1964.
[2] Richard Bellman,et al. Introduction to the mathematical theory of control processes , 1967 .
[3] J. T. O'Hanlan. The Fosbury flop. , 1968, Virginia medical monthly.
[4] David Q. Mayne,et al. Differential dynamic programming , 1972, The Mathematical Gazette.
[5] R. L. Keeney,et al. Decisions with Multiple Objectives: Preferences and Value Trade-Offs , 1977, IEEE Transactions on Systems, Man, and Cybernetics.
[6] Suguru Arimoto,et al. Bettering operation of Robots by learning , 1984, J. Field Robotics.
[7] Peter W. Glynn, et al. Likelihood ratio gradient estimation: an overview, 1987, WSC '87.
[8] Anil V. Rao,et al. Practical Methods for Optimal Control Using Nonlinear Programming , 1987 .
[9] Karl Johan Åström,et al. Adaptive Control , 1989, Embedded Digital Control with Microcontrollers.
[10] Mitsuo Kawato,et al. Feedback-Error-Learning Neural Network for Supervised Motor Learning , 1990 .
[11] Richard S. Sutton,et al. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming , 1990, ML.
[12] R.J. Williams,et al. Reinforcement learning is direct adaptive optimal control , 1991, IEEE Control Systems.
[13] Oliver G. Selfridge,et al. Real-time learning: a ball on a beam , 1992, [Proceedings 1992] IJCNN International Joint Conference on Neural Networks.
[14] Marco Colombetti,et al. Robot shaping: developing situated agents through learning , 1992 .
[15] Sridhar Mahadevan,et al. Automatic Programming of Behavior-Based Robots Using Reinforcement Learning , 1991, Artif. Intell..
[16] Leslie Pack Kaelbling,et al. Learning in embedded systems , 1993 .
[17] Christopher G. Atkeson,et al. Using Local Trajectory Optimizers to Speed Up Global Optimization in Dynamic Programming , 1993, NIPS.
[18] Anton Schwartz,et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards , 1993, ICML.
[19] Leemon C Baird,et al. Reinforcement Learning With High-Dimensional, Continuous Actions , 1993 .
[20] F. B. Vernadat,et al. Decisions with Multiple Objectives: Preferences and Value Tradeoffs , 1994 .
[21] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[22] Michael I. Jordan, et al. MIT Artificial Intelligence Laboratory and Center for Biological and Computational Learning, Department of Brain and Cognitive Sciences, 1996.
[23] Maja J. Mataric,et al. Reward Functions for Accelerated Learning , 1994, ICML.
[24] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[25] S. Schaal,et al. Robot juggling: implementation of memory-based learning , 1994, IEEE Control Systems.
[26] Prasad Tadepalli,et al. H-Learning: A Reinforcement Learning Method for Optimizing Undiscounted Average Reward , 1994 .
[27] John Rust. Using Randomization to Break the Curse of Dimensionality , 1997 .
[28] V. Gullapalli,et al. Acquiring robot skills via reinforcement learning , 1994, IEEE Control Systems.
[29] Dimitri P. Bertsekas,et al. Nonlinear Programming , 1995 .
[30] Sebastian Thrun,et al. An approach to learning mobile robot navigation , 1995, Robotics Auton. Syst..
[31] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[32] Inman Harvey,et al. Noise and the Reality Gap: The Use of Simulation in Evolutionary Robotics , 1995, ECAL.
[33] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[34] S. Schaal,et al. A Kendama Learning Robot Based on Bi-directional Theory , 1996, Neural Networks.
[35] George A. Bekey,et al. Rapid Reinforcement Learning for Reactive Control Policy Design for Autonomous Robots , 1996 .
[36] Jean-Arcady Meyer,et al. Learning reactive and planning rules in a motivationally autonomous animat , 1996, IEEE Trans. Syst. Man Cybern. Part B.
[37] B. Pasik-Duncan,et al. Adaptive Control , 1996, IEEE Control Systems.
[38] John N. Tsitsiklis, et al. Analysis of Temporal-Difference Learning with Function Approximation, 1996, NIPS.
[39] Jeff G. Schneider,et al. Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning , 1996, NIPS.
[40] Judy A. Franklin,et al. Biped dynamic walking using reinforcement learning , 1997, Robotics Auton. Syst..
[41] Atsuo Takanishi,et al. Development of a biped walking robot having antagonistic driven joints using nonlinear spring mechanism , 1997, Proceedings of International Conference on Robotics and Automation.
[42] Claude F. Touzet,et al. Neural reinforcement learning for behaviour synthesis , 1997, Robotics Auton. Syst..
[43] Geoffrey E. Hinton,et al. Using Expectation-Maximization for Reinforcement Learning , 1997, Neural Computation.
[44] Stefan Schaal,et al. Robot Learning From Demonstration , 1997, ICML.
[45] Christopher G. Atkeson,et al. Nonparametric Model-Based Reinforcement Learning , 1997, NIPS.
[46] András Lörincz,et al. Module Based Reinforcement Learning: An Application to a Real Robot , 1997, EWLR.
[47] Maja J. Mataric,et al. Reinforcement Learning in the Multi-Robot Domain , 1997, Auton. Robots.
[48] Roderic A. Grupen,et al. A feedback control structure for on-line learning tasks , 1997, Robotics Auton. Syst..
[49] J. Doyle,et al. Essentials of Robust Control , 1997 .
[50] Andrew G. Barto,et al. Reinforcement learning , 1998 .
[51] Minoru Asada,et al. Cooperative behavior acquisition in multi-mobile robots environment by reinforcement learning based on state vector estimation , 1998, Proceedings. 1998 IEEE International Conference on Robotics and Automation (Cat. No.98CH36146).
[52] Frank Kirchner. Q-learning of complex behaviours on a six-legged walking machine , 1998, Robotics Auton. Syst..
[53] Leslie Pack Kaelbling,et al. A Framework for Reinforcement Learning on Real Robots , 1998, AAAI/IAAI.
[54] Gerald Sommer,et al. Integrating symbolic knowledge in reinforcement learning , 1998, SMC'98 Conference Proceedings. 1998 IEEE International Conference on Systems, Man, and Cybernetics (Cat. No.98CH36218).
[55] Stuart J. Russell. Learning agents for uncertain environments (extended abstract) , 1998, COLT' 98.
[56] Preben Alstrøm,et al. Learning to Drive a Bicycle Using Reinforcement Learning and Shaping , 1998, ICML.
[57] Richard S. Sutton,et al. Introduction to Reinforcement Learning , 1998 .
[58] Mark D. Pendrith. Reinforcement Learning in Situated Agents: Theoretical and Practical Solutions , 1999, EWLR.
[59] Mark D. Pendrith. Reinforcement Learning in Situated Agents: Some Theoretical Problems and Practical Solutions, 1999.
[60] Geoffrey J. Gordon,et al. Approximate solutions to markov decision processes , 1999 .
[61] Karsten Berns,et al. Adaptive periodic movement control for the four legged walking machine BISAM , 1999, Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C).
[62] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[63] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[64] Stefan Schaal,et al. Is imitation learning the route to humanoid robots? , 1999, Trends in Cognitive Sciences.
[65] Luke Fletcher,et al. Reinforcement learning for a vision based mobile robot , 2000, Proceedings. 2000 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2000) (Cat. No.00CH37113).
[66] Gregory Piatetsky-Shapiro,et al. High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality , 2000 .
[67] Shigenobu Kobayashi,et al. Reinforcement learning of walking behavior for a four-legged robot , 2001, Proceedings of the 40th IEEE Conference on Decision and Control (Cat. No.01CH37228).
[68] Andrew W. Moore,et al. Direct Policy Search using Paired Statistical Tests , 2001, ICML.
[69] Kazuaki Yamada,et al. Emergent synthesis of motion patterns for locomotion robots , 2001, Artif. Intell. Eng..
[70] Jeff G. Schneider,et al. Autonomous helicopter control using reinforcement learning policy search methods , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).
[71] Christopher G. Atkeson,et al. Learning from observation using primitives , 2001, Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164).
[72] Jun Morimoto,et al. Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning , 2000, Robotics Auton. Syst..
[73] Shin Ishii,et al. Reinforcement Learning for Biped Locomotion , 2002, ICANN.
[74] A. Shwartz,et al. Handbook of Markov decision processes : methods and applications , 2002 .
[75] Andrew G. Barto,et al. Lyapunov Design for Safe Reinforcement Learning , 2003, J. Mach. Learn. Res..
[76] Jun Nakanishi,et al. Learning Attractor Landscapes for Learning Motor Primitives , 2002, NIPS.
[77] Xiao Huang,et al. Novelty and Reinforcement Learning in the Value System of Developmental Robots , 2002 .
[78] Mikael Norrlöf,et al. An adaptive iterative learning control algorithm with experiments on an industrial robot , 2002, IEEE Trans. Robotics Autom..
[79] John Langford,et al. Approximately Optimal Approximate Reinforcement Learning , 2002, ICML.
[80] Goldberg,et al. Genetic algorithms , 1993, Robust Control Systems with Genetic Algorithms.
[81] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[82] Leslie Pack Kaelbling,et al. Effective reinforcement learning for mobile robots , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).
[83] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[84] Martin A. Riedmiller,et al. Reinforcement learning on an omnidirectional mobile robot , 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).
[85] James C. Spall,et al. Introduction to stochastic search and optimization - estimation, simulation, and control , 2003, Wiley-Interscience series in discrete mathematics and optimization.
[86] S. Shankar Sastry,et al. Autonomous Helicopter Flight via Reinforcement Learning , 2003, NIPS.
[87] Jeff G. Schneider,et al. Covariant Policy Search , 2003, IJCAI.
[88] Sham M. Kakade. On the sample complexity of reinforcement learning, 2003.
[89] Jürgen Schmidhuber,et al. A robot that reinforcement-learns to identify and memorize important previous observations , 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).
[91] Gordon Cheng,et al. Learning from Observation and from Practice Using Behavioral Primitives , 2003, ISRR.
[92] T. J. Rivlin. An Introduction to the Approximation of Functions , 2003 .
[93] Jeff G. Schneider,et al. Policy Search by Dynamic Programming , 2003, NIPS.
[94] Darrin C. Bentivegna,et al. Learning From Observation and Practice Using Behavioral Primitives : Marble Maze , 2004 .
[95] Peter L. Bartlett,et al. Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning , 2001, J. Mach. Learn. Res..
[96] Peggy Fidelman,et al. Learning Ball Acquisition on a Physical Robot , 2004 .
[97] Dieter Fox,et al. Reinforcement learning for sensing strategies , 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566).
[98] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[99] Andrew W. Moore,et al. Locally Weighted Learning for Control , 1997, Artificial Intelligence Review.
[100] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[101] H. Sebastian Seung,et al. Stochastic policy gradient reinforcement learning on a simple 3D biped , 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566).
[102] Stefan Schaal,et al. Scalable Techniques from Nonparametric Statistics for Real Time Robot Learning , 2002, Applied Intelligence.
[103] Ben Tse,et al. Autonomous Inverted Helicopter Flight via Reinforcement Learning , 2004, ISER.
[104] Andrew W. Moore,et al. Prioritized sweeping: Reinforcement learning with less data and less time , 2004, Machine Learning.
[105] A. Moore,et al. Learning decisions: robustness, uncertainty, and approximation , 2004 .
[106] Dirk P. Kroese,et al. The Cross Entropy Method: A Unified Approach To Combinatorial Optimization, Monte-carlo Simulation (Information Science and Statistics) , 2004 .
[107] G. DeJong,et al. Theory and Application of Reward Shaping in Reinforcement Learning , 2004 .
[108] Michael Kearns,et al. Near-Optimal Reinforcement Learning in Polynomial Time , 1998, Machine Learning.
[109] M.T. Rosenstein,et al. Reinforcement learning with supervision by a stable controller , 2004, Proceedings of the 2004 American Control Conference.
[110] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[111] Pieter Abbeel,et al. Apprenticeship learning via inverse reinforcement learning , 2004, ICML.
[112] Christopher K. I. Williams,et al. Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) , 2005 .
[113] Ashutosh Saxena,et al. High speed obstacle avoidance using monocular vision and reinforcement learning , 2005, ICML.
[114] Takayuki Kanda,et al. Robot behavior adaptation for human-robot interaction based on policy gradient reinforcement learning , 2005, 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[115] H. Sebastian Seung,et al. Learning to Walk in 20 Minutes , 2005 .
[116] H. Kappen. Path integrals and symmetry breaking for optimal control theory , 2005, physics/0505066.
[117] Minoru Asada,et al. Purposive Behavior Acquisition for a Real Robot by Vision-Based Reinforcement Learning , 2005, Machine Learning.
[118] John Langford,et al. Relating reinforcement learning performance to classification performance , 2005, ICML '05.
[119] Maarten Steinbuch,et al. Learning-based identification and iterative learning control of direct-drive robots , 2005, IEEE Transactions on Control Systems Technology.
[120] Vishal Soni,et al. Reinforcement learning of hierarchical skills on the sony aibo robot , 2005, AAAI 2005.
[121] Florentin Wörgötter,et al. Fast biped walking with a reflexive controller and real-time policy searching , 2005, NIPS.
[122] Tomás Martínez-Marín,et al. Fast Reinforcement Learning for Vision-guided Mobile Robots , 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.
[123] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[124] Pieter Abbeel,et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight , 2006, NIPS.
[125] J. Andrew Bagnell,et al. Maximum margin planning , 2006, ICML.
[126] Pieter Abbeel,et al. Using inaccurate models in reinforcement learning , 2006, ICML.
[127] Emanuel Todorov,et al. Optimal Control Theory , 2006 .
[128] H. Liu,et al. A Heuristic Reinforcement Learning for Robot Approaching Objects , 2006, 2006 IEEE Conference on Robotics, Automation and Mechatronics.
[129] Christopher M. Bishop,et al. Pattern Recognition and Machine Learning (Information Science and Statistics) , 2006 .
[130] David M. Bradley,et al. Boosting Structured Prediction for Imitation Learning , 2006, NIPS.
[131] Sven Behnke,et al. Imitative Reinforcement Learning for Soccer Playing Robots , 2006, RoboCup.
[132] Wolfram Burgard,et al. Learning Relational Navigation Policies , 2006, 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[133] Robert Platt,et al. Improving Grasp Skills Using Schema Structured Learning , 2006 .
[134] Jürgen Schmidhuber,et al. Quasi-online reinforcement learning for robots , 2006, Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006..
[135] Aude Billard,et al. Reinforcement learning for imitating constrained reaching movements , 2007, Adv. Robotics.
[136] W. Burgard,et al. Autonomous blimp control using model-free reinforcement learning in a continuous state and action space , 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[137] Nasser M. Nasrabadi,et al. Pattern Recognition and Machine Learning , 2006, Technometrics.
[138] Lucas Paletta,et al. Perception and Developmental Learning of Affordances in Autonomous Robots , 2007, KI.
[139] Martin A. Riedmiller,et al. Neural Reinforcement Learning Controllers for a Real Robot Application , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.
[140] Pieter Abbeel,et al. Hierarchical Apprenticeship Learning with Application to Quadruped Locomotion , 2007, NIPS.
[141] I. Elhanany. Reinforcement Learning in Sensor-Guided AIBO Robots , 2007 .
[142] Stefan Schaal,et al. Dynamics systems vs. optimal control--a unifying view. , 2007, Progress in brain research.
[143] Siddhartha S. Srinivasa,et al. Imitation learning for locomotion and manipulation , 2007, 2007 7th IEEE-RAS International Conference on Humanoid Robots.
[144] Dieter Fox,et al. Gaussian Processes and Reinforcement Learning for Identification and Control of an Autonomous Blimp , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.
[145] Richard S. Sutton,et al. On the role of tracking in stationary environments , 2007, ICML '07.
[146] Tao Wang,et al. Automatic Gait Optimization with Gaussian Process Regression , 2007, IJCAI.
[147] Richard Alan Peters,et al. Reinforcement Learning with a Supervisor for a Mobile Robot in a Real-world Environment , 2007, 2007 International Symposium on Computational Intelligence in Robotics and Automation.
[148] Anind K. Dey,et al. Maximum Entropy Inverse Reinforcement Learning , 2008, AAAI.
[149] Yi Gu,et al. Space-indexed dynamic programming: learning to follow trajectories , 2008, ICML '08.
[150] David Silver,et al. High Performance Outdoor Navigation from Overhead Data using Imitation Learning , 2008, Robotics: Science and Systems.
[151] M. Goodman. Learning to Walk: The Origins of the UK's Joint Intelligence Committee , 2008 .
[152] Reinforcement Learning of Behaviors in Mobile Robots Using Noisy Infrared Sensing, 2008.
[153] Sebastian Thrun,et al. Apprenticeship learning for motion planning with application to parking lot navigation , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[154] Jun Nakanishi,et al. Operational Space Control: A Theoretical and Empirical Comparison , 2008, Int. J. Robotics Res..
[155] Jürgen Schmidhuber,et al. State-Dependent Exploration for Policy Gradient Methods , 2008, ECML/PKDD.
[156] Yong Duan,et al. Robot Navigation Based on Fuzzy RL Algorithm , 2008, ISNN.
[157] Kemal Leblebicioglu,et al. Free gait generation with reinforcement learning for a six-legged robot , 2008, Robotics Auton. Syst..
[158] Nicholas Roy,et al. Trajectory Optimization using Reinforcement Learning for Map Exploration , 2008, Int. J. Robotics Res..
[159] Stefan Schaal,et al. Movement reproduction and obstacle avoidance with dynamic movement primitives and potential fields , 2008, Humanoids 2008 - 8th IEEE-RAS International Conference on Humanoid Robots.
[160] Tomohiro Shibata,et al. Policy Gradient Learning of Cooperative Interaction with a Robot Using User's Biological Signals , 2009, ICONIP.
[161] Stefan Schaal,et al. Proc. Advances in Neural Information Processing Systems (NIPS '08) , 2008 .
[162] Jun Morimoto,et al. Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot , 2005, 5th IEEE-RAS International Conference on Humanoid Robots, 2005..
[163] Betty J. Mohler,et al. Learning perceptual coupling for motor primitives , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[164] Kazuhiro Ohkura,et al. A Reinforcement Learning Technique with an Adaptive Action Generator for a Multi-robot System , 2008, SAB.
[165] Stefan Schaal,et al. Learning to Control in Operational Space , 2008, Int. J. Robotics Res..
[166] Brett Browning,et al. Learning robot motion control with demonstration and advice-operators , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[167] Stefan Schaal,et al. 2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .
[168] Oliver Brock,et al. Learning to Manipulate Articulated Objects in Unstructured Environments Using a Grounded Relational Representation , 2008, Robotics: Science and Systems.
[169] Jan Peters,et al. Learning motor primitives for robotics , 2009, 2009 IEEE International Conference on Robotics and Automation.
[170] Ales Ude,et al. Task adaptation through exploration and action sequencing , 2009, 2009 9th IEEE-RAS International Conference on Humanoid Robots.
[171] Michel Tokic,et al. The Crawler, A Class Room Demonstrator for Reinforcement Learning , 2009, FLAIRS.
[172] John Langford,et al. Search-based structured prediction , 2009, Machine Learning.
[173] Brett Browning,et al. A survey of robot learning from demonstration , 2009, Robotics Auton. Syst..
[174] Andrew Y. Ng,et al. Regularization and feature selection in least-squares temporal difference learning , 2009, ICML '09.
[175] Pieter Abbeel,et al. Apprenticeship learning for helicopter control , 2009, CACM.
[176] Carl E. Rasmussen,et al. Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.
[177] Andrew Y. Ng,et al. Policy search via the signed derivative , 2009, Robotics: Science and Systems.
[178] Marc Toussaint,et al. Learning model-free robot control by a Monte Carlo EM algorithm , 2009, Auton. Robots.
[179] Oliver Kroemer,et al. Learning Visual Representations for Interactive Systems , 2009, ISRR.
[180] Oliver Kroemer,et al. Towards Motor Skill Learning for Robotics , 2007, ISRR.
[181] Martin A. Riedmiller,et al. Reinforcement learning for robot soccer , 2009, Auton. Robots.
[182] Oliver Kroemer,et al. Active learning using mean shift optimization for robot grasping , 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[183] Sethu Vijayakumar,et al. Using dimensionality reduction to exploit constraints in reinforcement learning , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[184] Maren Bennewitz,et al. Learning reliable and efficient navigation with a humanoid , 2010, 2010 IEEE International Conference on Robotics and Automation.
[185] Stefan Schaal,et al. Reinforcement learning of motor skills in high dimensions: A path integral approach , 2010, 2010 IEEE International Conference on Robotics and Automation.
[186] Pieter Abbeel,et al. Superhuman performance of surgical tasks by robots using iterative learning from human-guided demonstrations , 2010, 2010 IEEE International Conference on Robotics and Automation.
[187] Christopher G. Atkeson,et al. Control of Instantaneously Coupled Systems applied to humanoid walking , 2010, 2010 10th IEEE-RAS International Conference on Humanoid Robots.
[188] Ian R. Manchester,et al. LQR-trees: Feedback Motion Planning via Sums-of-Squares Verification , 2010, Int. J. Robotics Res..
[189] Jan Peters, et al. Policy Search for Motor Primitives in Robotics, 2008.
[190] Oliver Kroemer,et al. Combining active learning and reactive control for robot grasping , 2010, Robotics Auton. Syst..
[191] Jan Peters,et al. Reinforcement Learning to Adjust Robot Movements to New Situations , 2010, IJCAI.
[192] Yasemin Altun,et al. Relative Entropy Policy Search , 2010 .
[193] Jun Zhang,et al. Motor Learning at Intermediate Reynolds Number: Experiments with Policy Gradient on the Flapping Flight of a Rigid Wing , 2010, From Motor Learning to Interaction Learning in Robots.
[194] Richard L. Lewis,et al. Reward Design via Online Gradient Ascent , 2010, NIPS.
[195] Bart De Schutter,et al. Reinforcement Learning and Dynamic Programming Using Function Approximators , 2010 .
[196] Bojan Nemec,et al. Learning of a ball-in-a-cup playing robot , 2010, 19th International Workshop on Robotics in Alpe-Adria-Danube Region (RAAD 2010).
[197] Marc Peter Deisenroth,et al. A Practical and Conceptual Framework for Learning in Control , 2010 .
[198] Sebastian Thrun,et al. A probabilistic approach to mixed open-loop and closed-loop control, with application to extreme autonomous driving , 2010, 2010 IEEE International Conference on Robotics and Automation.
[199] Eric Rogers,et al. Iterative learning control applied to a gantry robot and conveyor system , 2010 .
[200] David Silver,et al. Learning from Demonstration for Autonomous Navigation in Complex Unstructured Terrain , 2010, Int. J. Robotics Res..
[201] Peter Stone,et al. Generalized model learning for Reinforcement Learning on a humanoid robot , 2010, 2010 IEEE International Conference on Robotics and Automation.
[202] Olivier Sigaud,et al. From Motor Learning to Interaction Learning in Robots , 2010, From Motor Learning to Interaction Learning in Robots.
[203] Jörg Stückler,et al. Learning Motion Skills from Expert Demonstrations and Own Experience using Gaussian Process Regression , 2010, ISR/ROBOTIK.
[204] Marc Toussaint,et al. Bayesian Time Series Models: Expectation maximisation methods for solving (PO)MDPs and optimal control problems , 2011 .
[205] Scott Kuindersma,et al. Autonomous Skill Acquisition on a Mobile Manipulator , 2011, AAAI.
[206] Ian R. Manchester,et al. Feedback controller parameterizations for Reinforcement Learning , 2011, 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[207] Stephen Hart,et al. Learning Generalizable Control Programs , 2011, IEEE Transactions on Autonomous Mental Development.
[208] Carl E. Rasmussen,et al. Learning to Control a Low-Cost Manipulator using Data-Efficient Reinforcement Learning , 2011, Robotics: Science and Systems.
[209] Carl E. Rasmussen,et al. PILCO: A Model-Based and Data-Efficient Approach to Policy Search , 2011, ICML.
[210] Stefan Schaal,et al. Learning variable impedance control , 2011, Int. J. Robotics Res..
[211] Martial Hebert,et al. Learning message-passing inference machines for structured prediction , 2011, CVPR 2011.
[212] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[213] Stefan Schaal,et al. Skill learning and task outcome prediction for manipulation , 2011, 2011 IEEE International Conference on Robotics and Automation.
[214] P. Schrimpf,et al. Dynamic Programming , 2011 .
[215] Stefan Schaal,et al. Learning motion primitive goals for robust manipulation , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[217] Ales Ude,et al. Learning to pour with a robot arm combining goal and shape learning for dynamic movement primitives , 2011, Robotics Auton. Syst..
[218] Stefan Schaal,et al. Learning force control policies for compliant manipulation , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[219] Oliver Kroemer,et al. Learning visual representations for perception-action systems , 2011, Int. J. Robotics Res..
[220] Howie Choset,et al. Using response surfaces and expected improvement to optimize snake robot gait parameters , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[221] George Konidaris,et al. Value Function Approximation in Reinforcement Learning Using the Fourier Basis , 2011, AAAI.
[222] Scott Kuindersma,et al. Learning dynamic arm motions for postural recovery , 2011, 2011 11th IEEE-RAS International Conference on Humanoid Robots.
[223] Oliver Kroemer,et al. Learning to select and generalize striking movements in robot table tennis , 2012, AAAI Fall Symposium: Robots Learning Interactively from Human Teachers.
[224] J. Andrew Bagnell,et al. Agnostic System Identification for Model-Based Reinforcement Learning , 2012, ICML.
[225] Warren B. Powell,et al. AI, OR and Control Theory: A Rosetta Stone for Stochastic Optimization , 2012 .
[226] J. Andrew Bagnell,et al. Reinforcement Planning: RL for optimal planners , 2012, 2012 IEEE International Conference on Robotics and Automation.
[227] Jan Peters,et al. Learning concurrent motor skills in versatile solution spaces , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[228] Scott Kuindersma,et al. Robot learning from demonstration by constructing skill trees , 2012, Int. J. Robotics Res..
[229] Olivier Sigaud,et al. Path Integral Policy Improvement with Covariance Matrix Adaptation , 2012, ICML.
[230] Pieter Abbeel,et al. Safe Exploration in Markov Decision Processes , 2012, ICML.
[231] Peter Stone,et al. RTMBA: A Real-Time Model-Based Reinforcement Learning Architecture for robot control , 2011, 2012 IEEE International Conference on Robotics and Automation.
[232] Hsien-I Lin,et al. Learning collision-free reaching skill from primitives , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[233] Sanjiban Choudhury. Application of Reinforcement Learning in Robot Soccer, 2013.
[234] Andreas Krause,et al. Advances in Neural Information Processing Systems (NIPS) , 2014 .
[235] Peter W. Glynn. Likelihood ratio gradient estimation: an overview, 1987.
[236] K. Schittkowski, et al. Nonlinear Programming.