Learning Search Strategies from Human Demonstrations
[1] Sebastian Thrun,et al. Simultaneous localization and mapping with unknown data association using FastSLAM , 2003, 2003 IEEE International Conference on Robotics and Automation (Cat. No.03CH37422).
[2] David Hsu,et al. SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces , 2008, Robotics: Science and Systems.
[3] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[4] Aude Billard,et al. Online learning of varying stiffness through physical human-robot interaction , 2012, 2012 IEEE International Conference on Robotics and Automation.
[5] Nancy M. Amato,et al. FIRM: Feedback controller-based information-state roadmap - A framework for motion planning under uncertainty , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[6] A. Leslie. Mapping the mind: ToMM, ToBY, and Agency: Core architecture and domain specificity , 1994 .
[7] G. A. Miller. The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information , 1956, Psychological Review.
[8] Sebastian Scherer,et al. Learning obstacle avoidance parameters from operator behavior , 2006, J. Field Robotics.
[9] Marc Toussaint,et al. POMDP manipulation via trajectory optimization , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[10] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[11] B. Sodian,et al. Theory of Mind , 2010 .
[12] Wolfram Burgard,et al. Probabilistic Robotics (Intelligent Robotics and Autonomous Agents) , 2005 .
[13] Cyrill Stachniss,et al. Simultaneous Localization and Mapping , 2016, Springer Handbook of Robotics, 2nd Ed..
[14] Aude Billard,et al. Learning search policies from humans in a partially observable context , 2014, ROBIO 2014.
[15] Joelle Pineau,et al. Online Planning Algorithms for POMDPs , 2008, J. Artif. Intell. Res..
[16] Peter Vrancx,et al. Game Theory and Multi-agent Reinforcement Learning , 2012, Reinforcement Learning.
[17] Dong Chen,et al. An uncertainty-aware precision grasping process for objects with unknown dimensions , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).
[18] Jimmy A. Jørgensen,et al. Transfer of assembly operations to new workpiece poses by adaptation to the desired force profile , 2013, 2013 16th International Conference on Advanced Robotics (ICAR).
[19] D. Bernoulli. Exposition of a New Theory on the Measurement of Risk , 1954 .
[20] Mary Hegarty,et al. What determines our navigational abilities? , 2010, Trends in Cognitive Sciences.
[21] Leslie Pack Kaelbling,et al. Belief space planning assuming maximum likelihood observations , 2010, Robotics: Science and Systems.
[22] Kenji Doya,et al. EM-based policy hyper parameter exploration: application to standing and balancing of a two-wheeled smartphone robot , 2015, Artificial Life and Robotics.
[23] N. Roy,et al. The Belief Roadmap: Efficient Planning in Belief Space by Factoring the Covariance , 2009, Int. J. Robotics Res..
[24] Fillia Makedon,et al. Approximate planning in POMDPs via MDP heuristic , 2014, 2014 International Joint Conference on Neural Networks (IJCNN).
[25] George Apostolakis,et al. Decision theory , 1986 .
[26] Stefan Schaal,et al. Reinforcement Learning With Sequences of Motion Primitives for Robust Manipulation , 2012, IEEE Transactions on Robotics.
[27] Jan Peters,et al. Fitted Q-iteration by Advantage Weighted Regression , 2008, NIPS.
[28] Stefan Schaal,et al. Robot Programming by Demonstration , 2009, Springer Handbook of Robotics.
[29] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[30] Reid G. Simmons,et al. Point-Based POMDP Algorithms: Improved Analysis and Implementation , 2005, UAI.
[31] Joelle Pineau,et al. Point-based value iteration: An anytime algorithm for POMDPs , 2003, IJCAI.
[32] Nikos A. Vlassis,et al. Planning with Continuous Actions in Partially Observable Environments , 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.
[33] Ron Alterovitz,et al. Motion planning under uncertainty using iterative local optimization in belief space , 2012, Int. J. Robotics Res..
[34] Peter Stone,et al. Reinforcement learning , 2019, Scholarpedia.
[35] Bart De Schutter,et al. Approximate reinforcement learning: An overview , 2011, 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[36] Hugh F. Durrant-Whyte,et al. Simultaneous localization and mapping: part I , 2006, IEEE Robotics & Automation Magazine.
[37] Stefan Schaal,et al. Online movement adaptation based on previous sensor experiences , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[38] Sebastian Thrun,et al. Monte Carlo POMDPs , 1999, NIPS.
[39] R. J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[40] Stefan Schaal,et al. Learning motion primitive goals for robust manipulation , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[41] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[42] Jonathan Baxter,et al. Scaling Internal-State Policy-Gradient Methods for POMDPs , 2002 .
[43] Rüdiger Dillmann,et al. Solving Continuous POMDPs: Value Iteration with Incremental Learning of an Efficient Space Representation , 2013, ICML.
[44] Pedro U. Lima,et al. Point-Based POMDP Solving with Factored Value Function Approximation , 2014, AAAI.
[45] Jiming Liu,et al. Improving POMDP Tractability via Belief Compression and Clustering , 2010, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[46] Alejandro Agostini,et al. Reinforcement Learning with a Gaussian mixture model , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[47] Ron Alterovitz,et al. Motion planning under uncertainty for medical needle steering using optimization in belief space , 2014, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[48] Roderic A. Grupen,et al. Learning admittance mappings for force-guided assembly , 1994, Proceedings of the 1994 IEEE International Conference on Robotics and Automation.
[49] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[50] D. Warner North,et al. A Tutorial Introduction to Decision Theory , 1968, IEEE Trans. Syst. Sci. Cybern..
[51] Joshua B. Tenenbaum,et al. Bayesian models of human action understanding , 2005, NIPS.
[52] Aude Billard,et al. Robot learning by demonstration , 2013, Scholarpedia.
[53] E. Spelke,et al. Updating egocentric representations in human navigation , 2000, Cognition.
[54] Nicholas Roy,et al. Exponential Family PCA for Belief Compression in POMDPs , 2002, NIPS.
[55] Florian Schmidt,et al. Sequential trajectory re-planning with tactile information gain for dexterous grasping under object-pose uncertainty , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[56] William D. Smart,et al. A Scalable Method for Solving High-Dimensional Continuous POMDPs Using Local Approximation , 2010, UAI.
[57] Wolfram Burgard,et al. Gaussian Beam Processes: A Nonparametric Bayesian Measurement Model for Range Finders , 2007, Robotics: Science and Systems.
[58] Joshua B. Tenenbaum,et al. Bayesian Theory of Mind: Modeling Joint Belief-Desire Attribution , 2011, CogSci.
[59] Michael S. Branicky,et al. Search strategies for peg-in-hole assemblies with position uncertainty , 2001, Proceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and Systems. Expanding the Societal Role of Robotics in the the Next Millennium (Cat. No.01CH37180).
[60] P. Lavenex,et al. Human short-term spatial memory: Precision predicts capacity , 2015, Cognitive Psychology.
[61] Edward J. Sondik,et al. The Optimal Control of Partially Observable Markov Processes over a Finite Horizon , 1973, Oper. Res..
[62] Kamal K. Gupta,et al. RRT-SLAM for motion planning with motion and map uncertainty for robot exploration , 2008, 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[63] Alex Brooks,et al. A Monte Carlo Update for Parametric POMDPs , 2007, ISRR.
[64] Danica Kragic,et al. Dexterous grasping under shape uncertainty , 2016, Robotics Auton. Syst..
[65] Sebastian Thrun,et al. FastSLAM 2.0: an improved particle filtering algorithm for simultaneous localization and mapping that provably converges , 2003, IJCAI.
[66] Klas Kronander,et al. Control and Learning of Compliant Manipulation Skills , 2015 .
[67] Aude Billard,et al. Non-Parametric Bayesian State Space Estimator for Negative Information , 2017, Front. Robot. AI.
[68] Jun Nakanishi,et al. Learning Movement Primitives , 2005, ISRR.
[69] Héctor H. González-Baños,et al. Navigation Strategies for Exploring Indoor Environments , 2002, Int. J. Robotics Res..
[70] Wolfram Burgard,et al. Information Gain-based Exploration Using Rao-Blackwellized Particle Filters , 2005, Robotics: Science and Systems.
[71] Andrew W. Moore,et al. Generalization in Reinforcement Learning: Safely Approximating the Value Function , 1994, NIPS.
[72] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[73] Ross D. Shachter. Bayes-Ball: The Rational Pastime (for Determining Irrelevance and Requisite Information in Belief Networks and Influence Diagrams) , 1998, UAI.
[74] Holger Voos,et al. Controller design for quadrotor UAVs using reinforcement learning , 2010, 2010 IEEE International Conference on Control Applications.
[75] J. Neumann,et al. The Theory of Games and Economic Behaviour , 1944 .
[76] Aude Billard,et al. Learning search behaviour from humans , 2013, 2013 IEEE International Conference on Robotics and Biomimetics (ROBIO).
[77] Pascal Poupart,et al. Point-Based Value Iteration for Continuous POMDPs , 2006, J. Mach. Learn. Res..
[78] Balaraman Ravindran,et al. Where do I look now? Gaze allocation during visually guided manipulation , 2012, 2012 IEEE International Conference on Robotics and Automation.
[79] Pierre Geurts,et al. Tree-Based Batch Mode Reinforcement Learning , 2005, J. Mach. Learn. Res..
[80] Jaime Valls Miró,et al. Active Pose SLAM , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[81] Reid G. Simmons,et al. Heuristic Search Value Iteration for POMDPs , 2004, UAI.
[82] Seung-kook Yun,et al. Compliant manipulation for peg-in-hole: Is passive compliance a key to learn contact motion? , 2008, 2008 IEEE International Conference on Robotics and Automation.
[83] H. Sung. Gaussian Mixture Regression and Classification , 2004 .
[84] Peter L. Bartlett,et al. Reinforcement Learning in POMDP's via Direct Gradient Ascent , 2000, ICML.
[85] Martin A. Riedmiller,et al. Deep auto-encoder neural networks in reinforcement learning , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[86] Stefan Schaal,et al. Reinforcement Learning for Humanoid Robotics , 2003 .
[87] Mohamad Bdiwi,et al. Improved peg-in-hole (5-pin plug) task: Intended for charging electric vehicles by robot system automatically , 2015, 2015 IEEE 12th International Multi-Conference on Systems, Signals & Devices (SSD15).
[88] M. Proulx,et al. Visual experience facilitates allocentric spatial representation , 2013, Behavioural Brain Research.
[89] Aude Billard,et al. Learning from failed demonstrations in unreliable systems , 2013, 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids).
[90] Sandra Hirche,et al. Risk-sensitive interaction control in uncertain manipulation tasks , 2013, 2013 IEEE International Conference on Robotics and Automation.
[91] W. Fisher,et al. Hybrid Position/Force Control: A Correct Formulation , 1992 .
[92] Wolfram Burgard,et al. Coastal navigation-mobile robot navigation with uncertainty in dynamic environments , 1999, Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C).
[93] N. Burgess,et al. Spatial memory: how egocentric and allocentric combine , 2006, Trends in Cognitive Sciences.
[94] P. Abbeel,et al. LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information , 2011 .
[95] Juan Andrade-Cetto,et al. Dense entropy decrease estimation for mobile robot exploration , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[96] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[97] Gang Niu,et al. Regularized Policy Gradients: Direct Variance Reduction in Policy Gradient Estimation , 2015, ACML.
[98] Heping Chen,et al. Online parameter optimization in robotic force controlled assembly processes , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[99] Ales Ude,et al. Solving peg-in-hole tasks by human demonstration and exception strategies , 2014 .
[100] Sebastian Thrun,et al. Coastal Navigation with Mobile Robots , 1999, NIPS.
[101] Sebastian Thrun,et al. Particle Filters in Robotics , 2002, UAI.
[102] Daniel Vélez Día,et al. Biomechanics and Motor Control of Human Movement , 2013 .
[103] Geoffrey J. Gordon,et al. Finding Approximate POMDP solutions Through Belief Compression , 2011, J. Artif. Intell. Res..
[104] Michael I. Jordan,et al. PEGASUS: A policy search method for large MDPs and POMDPs , 2000, UAI.
[105] Kurt Konolige,et al. Autonomous door opening and plugging in with a personal robot , 2010, 2010 IEEE International Conference on Robotics and Automation.
[106] L. L. Lin,et al. Fast programming of Peg-in-hole Actions by human demonstration , 2014, 2014 International Conference on Mechatronics and Control (ICMC).
[107] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[108] Peter W. Glynn,et al. Kernel-Based Reinforcement Learning in Average-Cost Problems: An Application to Optimal Portfolio Choice , 2000, NIPS.
[109] Joshua B. Tenenbaum,et al. The Development of Joint Belief-Desire Inferences , 2012, CogSci.
[110] Neil J. Gordon,et al. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking , 2002, IEEE Trans. Signal Process..
[111] Peter N. C. Mohr,et al. Decision making under uncertainty , 2013, Front. Neurosci..
[112] Christian Büchel,et al. Spatial updating: how the brain keeps track of changing object locations during observer motion , 2008, Nature Neuroscience.
[113] Wee Sun Lee,et al. A POMDP Approach to Robot Motion Planning under Uncertainty , 2010 .
[114] Gordon E Legge,et al. Lost in virtual space: studies in human and ideal spatial navigation. , 2006, Journal of experimental psychology. Human perception and performance.
[115] Brian Scassellati,et al. Theory of Mind for a Humanoid Robot , 2002, Auton. Robots.
[116] Nancy M. Amato,et al. Robust online belief space planning in changing environments: Application to physical mobile robots , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).
[117] Leslie Pack Kaelbling,et al. Non-Gaussian belief space planning: Correctness and complexity , 2012, 2012 IEEE International Conference on Robotics and Automation.
[118] Wolfram Burgard,et al. A Tutorial on Graph-Based SLAM , 2010, IEEE Intelligent Transportation Systems Magazine.
[119] Sebastian Thrun,et al. Integrating Grid-Based and Topological Maps for Mobile Robot Navigation , 1996, AAAI/IAAI, Vol. 2.
[120] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[121] Andrew W. Moore,et al. Locally Weighted Learning , 1997, Artificial Intelligence Review.
[122] David Silver,et al. Learning from Demonstration for Autonomous Navigation in Complex Unstructured Terrain , 2010, Int. J. Robotics Res..
[123] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[124] Moonhong Baeg,et al. Intuitive peg-in-hole assembly strategy with a compliant manipulator , 2013, IEEE ISR 2013.