A Review of Robot Learning for Manipulation: Challenges, Representations, and Algorithms

A key challenge in intelligent robotics is creating robots that can interact directly with the world around them to achieve their goals. The last decade has seen substantial growth in research on robot manipulation, which aims to exploit the increasing availability of affordable robot arms and grippers to build such robots. Learning will be central to these autonomous systems, because the real world contains too much variation for a robot to have, in advance, an accurate model of its environment, the objects in it, or the skills required to manipulate them. We survey a representative subset of this research that uses machine learning for manipulation. We describe a formalization of the robot manipulation learning problem that synthesizes existing research into a single coherent framework, and we highlight the many remaining research opportunities and challenges.


[239]  Sergey Levine,et al.  Guided Policy Search , 2013, ICML.

[240]  Leslie Pack Kaelbling,et al.  Modular meta-learning , 2018, CoRL.

[241]  Pierre-Yves Oudeyer,et al.  Incremental local online Gaussian Mixture Regression for imitation learning of multiple tasks , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[242]  Andrew G. Barto,et al.  Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining , 2009, NIPS.

[243]  Tom Schaul,et al.  Universal Value Function Approximators , 2015, ICML.

[244]  Robert D. Howe,et al.  Contact State Estimation Using Multiple Model Estimation and Hidden Markov Models , 2004, Int. J. Robotics Res..

[245]  Wolfram Burgard,et al.  A Probabilistic Framework for Learning Kinematic Models of Articulated Objects , 2011, J. Artif. Intell. Res..

[246]  Christopher G. Atkeson,et al.  Differential dynamic programming with temporally decomposed dynamics , 2015, 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids).

[247]  Christopher G. Atkeson,et al.  Combining finger vision and optical tactile sensing: Reducing and handling errors while cutting vegetables , 2016, 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids).

[248]  Peter Englert,et al.  Inverse KKT: Learning cost functions of manipulation tasks from demonstrations , 2017, ISRR.

[249]  David M. Bradley,et al.  Boosting Structured Prediction for Imitation Learning , 2006, NIPS.

[250]  Andrew W. Moore,et al.  Locally Weighted Learning for Control , 1997, Artificial Intelligence Review.

[251]  Moritz Tenorth,et al.  RoboEarth - A World Wide Web for Robots , 2011, ICRA 2011.

[252]  Leonidas J. Guibas,et al.  Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[253]  Stefan Schaal,et al.  Natural Actor-Critic , 2003, Neurocomputing.

[254]  M. Spong,et al.  Robot Modeling and Control , 2005 .

[255]  Maja J. Mataric,et al.  Performance-Derived Behavior Vocabularies: Data-Driven Acquisition of Skills from Motion , 2004, Int. J. Humanoid Robotics.

[256]  Danica Kragic,et al.  ST-HMP: Unsupervised Spatio-Temporal feature learning for tactile data , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[257]  Mohamed Sultan Mohamed Ali,et al.  Development of a shape-memory-alloy micromanipulator based on integrated bimorph microactuators , 2016 .

[258]  Kiho Kim,et al.  Robotic contamination cleaning system , 2002, IEEE/RSJ International Conference on Intelligent Robots and Systems.

[259]  Ugo Pattacini,et al.  Heteroscedastic Regression and Active Learning for Modeling Affordances in Humanoids , 2018, IEEE Transactions on Cognitive and Developmental Systems.

[260]  Scott Niekum,et al.  Learning Multi-Step Robotic Tasks from Observation , 2018, ArXiv.

[261]  Scott Niekum,et al.  Clustering via Dirichlet Process Mixture Models for Portable Skill Discovery , 2011, Lifelong Learning.

[262]  Sergey Levine,et al.  Learning Robotic Manipulation of Granular Media , 2017, CoRL.

[263]  Xiao Huang,et al.  Novelty and Reinforcement Learning in the Value System of Developmental Robots , 2002 .

[264]  Daniela Rus,et al.  Learning Object Grasping for Soft Robot Hands , 2018, IEEE Robotics and Automation Letters.

[265]  Brian Scassellati,et al.  Autonomously constructing hierarchical task networks for planning and human-robot collaboration , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[266]  Stefan Schaal,et al.  Probabilistic Articulated Real-Time Tracking for Robot Manipulation , 2016, IEEE Robotics and Automation Letters.

[267]  Peter I. Corke,et al.  Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control , 2015, ICRA 2015.

[268]  Artur Arsenio,et al.  Reinforcing robot perception of multi-modal events through repetition and redundancy , 2006 .

[269]  Tom Schaul,et al.  Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.

[270]  Mike Stilman,et al.  Combining motion planning and optimization for flexible robot manipulation , 2010, 2010 10th IEEE-RAS International Conference on Humanoid Robots.

[271]  Thierry Siméon,et al.  Manipulation Planning with Probabilistic Roadmaps , 2004, Int. J. Robotics Res..

[272]  Jan Peters,et al.  Probabilistic Movement Primitives , 2013, NIPS.

[273]  Larry H. Matthies,et al.  Task-oriented grasping with semantic and geometric scene understanding , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[274]  Jan Peters,et al.  Hierarchical Relative Entropy Policy Search , 2014, AISTATS.

[275]  Sergey Levine,et al.  Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning , 2018, ArXiv.

[276]  George Konidaris,et al.  Learning Symbolic Representations for Planning with Parameterized Skills , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[277]  Danica Kragic,et al.  A Sensorimotor Learning Framework for Object Categorization , 2016, IEEE Transactions on Cognitive and Developmental Systems.

[278]  Tamim Asfour,et al.  Learn to wipe: A case study of structural bootstrapping from sensorimotor experience , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[279]  Leslie Pack Kaelbling,et al.  Active Model Learning and Diverse Action Sampling for Task and Motion Planning , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[280]  Eyal Amir,et al.  Bayesian Inverse Reinforcement Learning , 2007, IJCAI.

[281]  Andrew W. Moore,et al.  Locally Weighted Learning , 1997, Artificial Intelligence Review.

[282]  Peter K. Allen,et al.  Learning grasp stability , 2012, 2012 IEEE International Conference on Robotics and Automation.

[283]  Kuan-Ting Yu,et al.  More than a million ways to be pushed. A high-fidelity experimental dataset of planar pushing , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[284]  Marcin Andrychowicz,et al.  Sim-to-Real Transfer of Robotic Control with Dynamics Randomization , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[285]  Timothy Bretl,et al.  Maximum entropy inverse reinforcement learning in continuous state spaces with path integrals , 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[286]  Ufuk Topcu,et al.  Safe Reinforcement Learning via Shielding , 2017, AAAI.

[287]  Yuval Tassa,et al.  Control-limited differential dynamic programming , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[288]  Carl E. Rasmussen,et al.  Gaussian Processes for Data-Efficient Learning in Robotics and Control , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[289]  Kee-Eung Kim,et al.  Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions , 2012, NIPS.

[290]  Scott Niekum,et al.  Better-than-Demonstrator Imitation Learning via Automatically-Ranked Demonstrations , 2019, CoRL.

[291]  J. Andrew Bagnell,et al.  Modeling Purposeful Adaptive Behavior with the Principle of Maximum Causal Entropy , 2010 .

[292]  Peter Stone,et al.  Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation , 2016, AAAI.

[293]  Silvio Savarese,et al.  Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[294]  James L. McClelland,et al.  Autonomous Mental Development by Robots and Animals , 2001, Science.

[295]  George J. Pappas,et al.  Temporal logic motion planning for dynamic robots , 2009, Autom..

[296]  Sergey Levine,et al.  End-to-End Learning of Semantic Grasping , 2017, CoRL.

[297]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[298]  Ales Leonardis,et al.  One-shot learning and generation of dexterous grasps for novel objects , 2016, Int. J. Robotics Res..

[299]  Oliver Brock,et al.  Learning state representations with robotic priors , 2015, Auton. Robots.

[300]  Gillian M. Hayes,et al.  A Robot Controller Using Learning by Imitation , 1994 .

[301]  Mi-Ching Tsai,et al.  Robust and Optimal Control , 2014 .

[302]  Michael R. James,et al.  Predictive State Representations: A New Theory for Modeling Dynamical Systems , 2004, UAI.

[303]  Danica Kragic,et al.  Learning grasp stability based on tactile data and HMMs , 2010, 19th International Symposium in Robot and Human Interactive Communication.

[304]  Peter K. Allen,et al.  Grasp adjustment on novel objects using tactile experience from similar local geometry , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[305]  Leslie Pack Kaelbling,et al.  Task-Driven Tactile Exploration , 2010, Robotics: Science and Systems.

[306]  Mark Steedman,et al.  Object-Action Complexes: Grounded abstractions of sensory-motor processes , 2011, Robotics Auton. Syst..

[307]  Dana H. Ballard,et al.  Task Frames in Robot Manipulation , 1984, AAAI.

[308]  Nan Jiang,et al.  Abstraction Selection in Model-based Reinforcement Learning , 2015, ICML.

[309]  Peter Dayan,et al.  Q-learning , 1992, Machine Learning.

[310]  Yiannis Aloimonos,et al.  Affordance detection of tool parts from geometric features , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[311]  Matthew T. Mason,et al.  Compliance and Force Control for Computer Controlled Manipulators , 1981, IEEE Transactions on Systems, Man, and Cybernetics.

[312]  Andrew Y. Ng,et al.  Algorithms for Inverse Reinforcement Learning , 2000, ICML.

[313]  Byron Boots,et al.  Hilbert Space Embeddings of Predictive State Representations , 2013, UAI.

[314]  Arjun Guha,et al.  Interactive Robot Transition Repair With SMT , 2018, IJCAI.

[315]  Volkan Cevher,et al.  Interactive Teaching Algorithms for Inverse Reinforcement Learning , 2019, IJCAI.

[316]  Michael I. Jordan,et al.  Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning , 2018, ArXiv.

[317]  Doina Precup,et al.  Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..

[318]  Sergey Levine,et al.  Learning Latent Plans from Play , 2019, CoRL.

[319]  Matei T. Ciocarlie,et al.  Accurate contact localization and indentation depth prediction with an optics-based tactile sensor , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[320]  Christopher G. Atkeson,et al.  Stereo vision of liquid and particle flow for robot pouring , 2016, 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids).

[321]  Christopher G. Atkeson,et al.  Implementing tactile behaviors using FingerVision , 2017, 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids).

[322]  Oliver Brock,et al.  Interactive Perception: Leveraging Action in Perception and Perception in Action , 2016, IEEE Transactions on Robotics.

[323]  Justus H. Piater,et al.  Bottom-up learning of object categories, action effects and logical rules: From continuous manipulative exploration to symbolic planning , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[324]  Kate Saenko,et al.  Grasp Pose Detection in Point Clouds , 2017, Int. J. Robotics Res..

[325]  Arkanath Pathak,et al.  Learning 6-DOF Grasping Interaction via Deep Geometry-Aware 3D Representations , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[326]  Gaurav S. Sukhatme,et al.  Learning spatial preconditions of manipulation skills using random forests , 2016, 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids).

[327]  Oliver Kroemer,et al.  Predicting Grasp Success with a Soft Sensing Skin and Shape-Memory Actuated Gripper , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[328]  Moritz Tenorth,et al.  CRAM — A Cognitive Robot Abstract Machine for everyday manipulation in human environments , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[329]  Kenneth O. Stanley,et al.  Go-Explore: a New Approach for Hard-Exploration Problems , 2019, ArXiv.

[330]  Alexander Stoytchev,et al.  Learning to slide a magnetic card through a card reader , 2012, 2012 IEEE International Conference on Robotics and Automation.

[331]  Marek Petrik,et al.  Safe Policy Improvement by Minimizing Robust Baseline Regret , 2016, NIPS.

[332]  J. Gibson The Ecological Approach to Visual Perception , 1979 .

[333]  V. Gullapalli,et al.  Acquiring robot skills via reinforcement learning , 1994, IEEE Control Systems.

[334]  Sven Behnke,et al.  Transferring Category-Based Functional Grasping Skills by Latent Space Non-Rigid Registration , 2018, IEEE Robotics and Automation Letters.

[335]  Sven Behnke,et al.  RGB-D object detection and semantic segmentation for autonomous manipulation in clutter , 2018, Int. J. Robotics Res..

[336]  Matei T. Ciocarlie,et al.  Collaborative grasp planning with multiple object representations , 2011, 2011 IEEE International Conference on Robotics and Automation.

[337]  Jan Peters,et al.  Probabilistic segmentation applied to an assembly task , 2015, 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids).

[338]  Scott Niekum,et al.  Learning Hybrid Object Kinematics for Efficient Hierarchical Planning Under Uncertainty , 2019, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[339]  Jitendra Malik,et al.  More Than a Feeling: Learning to Grasp and Regrasp Using Vision and Touch , 2018, IEEE Robotics and Automation Letters.

[340]  Jiajun Wu,et al.  Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids , 2018, ICLR.

[341]  Pieter Abbeel,et al.  Evolved Policy Gradients , 2018, NeurIPS.

[342]  Alexander Stoytchev,et al.  Behavior-Grounded Representation of Tool Affordances , 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.

[343]  Stephen Hart,et al.  An intrinsic reward for affordance exploration , 2009, 2009 IEEE 8th International Conference on Development and Learning.

[344]  Scott Niekum,et al.  Learning grounded finite-state representations from unstructured demonstrations , 2015, Int. J. Robotics Res..

[345]  Daniel Kappler,et al.  Riemannian Motion Policies , 2018, ArXiv.

[346]  Marc Toussaint,et al.  Differentiable Physics and Stable Modes for Tool-Use and Manipulation Planning , 2018, Robotics: Science and Systems.

[347]  Philip S. Thomas,et al.  High Confidence Policy Improvement , 2015, ICML.

[348]  Jörg Stückler,et al.  Adaptive tool-use strategies for anthropomorphic service robots , 2014, 2014 IEEE-RAS International Conference on Humanoid Robots.

[349]  Maya Cakmak,et al.  From primitive behaviors to goal-directed behavior using affordances , 2007, 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[350]  Siddhartha S. Srinivasa,et al.  The manifold particle filter for state estimation on high-dimensional implicit manifolds , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[351]  Jiajun Wu,et al.  Galileo: Perceiving Physical Object Properties by Integrating a Physics Engine with Deep Learning , 2015, NIPS.

[352]  Jivko Sinapov,et al.  Vibrotactile Recognition and Categorization of Surfaces by a Humanoid Robot , 2011, IEEE Transactions on Robotics.

[353]  Leslie Pack Kaelbling,et al.  FFRob: Leveraging symbolic planning for efficient task and motion planning , 2016, Int. J. Robotics Res..

[354]  Daniel H. Grollman,et al.  Incremental learning of subtasks from unsegmented demonstration , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[355]  Connor Schenck,et al.  Interactive object recognition using proprioceptive and auditory feedback , 2011, Int. J. Robotics Res..

[356]  Jun Nakanishi,et al.  Learning Movement Primitives , 2005, ISRR.

[357]  Sergey Levine,et al.  Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation , 2017, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[358]  Pieter Abbeel,et al.  Third-Person Imitation Learning , 2017, ICLR.

[359]  Jakub W. Pachocki,et al.  Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res..

[360]  Pieter Abbeel,et al.  Superhuman performance of surgical tasks by robots using iterative learning from human-guided demonstrations , 2010, 2010 IEEE International Conference on Robotics and Automation.

[361]  Shie Mannor,et al.  A Tutorial on the Cross-Entropy Method , 2005, Ann. Oper. Res..

[362]  Thomas Deselaers,et al.  Measuring the Objectness of Image Windows , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[363]  Ufuk Topcu,et al.  Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints , 2014, Robotics: Science and Systems.

[364]  Chong Li,et al.  Model-Free Reinforcement Learning , 2019, Reinforcement Learning for Cyber-Physical Systems.

[365]  Leslie Pack Kaelbling,et al.  Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..

[366]  Benjamin Kuipers,et al.  Autonomous Learning of High-Level States and Actions in Continuous Environments , 2012, IEEE Transactions on Autonomous Mental Development.

[367]  Leslie Pack Kaelbling,et al.  Grasping POMDPs , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.

[368]  Silvio Savarese,et al.  Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[369]  Marcin Andrychowicz,et al.  Hindsight Experience Replay , 2017, NIPS.

[370]  Jiajun Wu,et al.  Physics 101: Learning Physical Object Properties from Unlabeled Videos , 2016, BMVC.

[371]  Yishay Mansour,et al.  Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.

[372]  Leslie Pack Kaelbling,et al.  Augmenting Physical Simulators with Stochastic Neural Networks: Case Study of Planar Pushing and Bouncing , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[373]  Yuval Tassa,et al.  Continuous control with deep reinforcement learning , 2015, ICLR.

[374]  Sergey Levine,et al.  End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..

[375]  Pieter Abbeel,et al.  Active exploration using trajectory optimization for robotic grasping in the presence of occlusions , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[376]  Charles C. Kemp,et al.  Material Recognition from Heat Transfer given Varying Initial Conditions and Short-Duration Contact , 2015, Robotics: Science and Systems.

[377]  Stefan Schaal,et al.  Learning variable impedance control , 2011, Int. J. Robotics Res..

[378]  Peter Stone,et al.  Learning Multi-Modal Grounded Linguistic Semantics by Playing "I Spy" , 2016, IJCAI.

[379]  Carl E. Rasmussen,et al.  Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.

[380]  Ashutosh Saxena,et al.  Robobarista: Object Part Based Transfer of Manipulation Trajectories from Crowd-Sourcing in 3D Pointclouds , 2015, ISRR.

[381]  George Konidaris,et al.  Constructing Abstraction Hierarchies Using a Skill-Symbol Loop , 2015, IJCAI.

[382]  Alec Radford,et al.  Proximal Policy Optimization Algorithms , 2017, ArXiv.

[383]  Manuel Lopes,et al.  Temporal segmentation of pair-wise interaction phases in sequential manipulation demonstrations , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[384]  Pierre-Yves Oudeyer,et al.  What is Intrinsic Motivation? A Typology of Computational Approaches , 2007, Frontiers Neurorobotics.

[385]  Andrew G. Barto,et al.  Building Portable Options: Skill Transfer in Reinforcement Learning , 2007, IJCAI.

[386]  Marc Toussaint,et al.  Active exploration of joint dependency structures , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[387]  Yi Li,et al.  Robot Learning Manipulation Action Plans by "Watching" Unconstrained Videos from the World Wide Web , 2015, AAAI.

[388]  Christopher Burgess,et al.  DARLA: Improving Zero-Shot Transfer in Reinforcement Learning , 2017, ICML.

[389]  Taolue Chen,et al.  Synthesis for Multi-objective Stochastic Games: An Application to Autonomous Urban Driving , 2013, QEST.

[390]  Shie Mannor,et al.  The Cross Entropy Method for Fast Policy Search , 2003, ICML.

[391]  Manuel Lopes,et al.  Active Learning for Reward Estimation in Inverse Reinforcement Learning , 2009, ECML/PKDD.

[392]  Yilun Zhou,et al.  Representing, learning, and controlling complex object interactions , 2018, Autonomous Robots.

[393]  Alberto Rodriguez,et al.  Failure detection in assembly: Force signature analysis , 2010, 2010 IEEE International Conference on Automation Science and Engineering.

[394]  Danica Kragic,et al.  Learning a dictionary of prototypical grasp-predicting parts from grasping experience , 2013, 2013 IEEE International Conference on Robotics and Automation.

[395]  Marc Toussaint,et al.  Uncertainty aware grasping and tactile exploration , 2013, 2013 IEEE International Conference on Robotics and Automation.

[396]  John F. Canny,et al.  Robot Bed-Making: Deep Transfer Learning Using Depth Sensing of Deformable Fabric , 2018, ArXiv.

[397]  Lawrence Davis,et al.  Handbook Of Genetic Algorithms , 1990 .

[398]  Scott Niekum,et al.  Viewpoint selection for visual failure detection , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[399]  Wojciech Zaremba,et al.  Domain randomization for transferring deep neural networks from simulation to the real world , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[400]  Naoyuki Kubota,et al.  Reinforcement Learning in non-stationary environments: An intrinsically motivated stress based memory retrieval performance (SBMRP) model , 2014, 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE).

[401]  Pravesh Ranchod,et al.  Nonparametric Bayesian reward segmentation for skill discovery using inverse reinforcement learning , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[402]  Dieter Fox,et al.  DART: dense articulated real-time tracking with consumer depth cameras , 2015, Auton. Robots.

[403]  S. Levine,et al.  Safety Augmented Value Estimation From Demonstrations (SAVED): Safe Deep Model-Based RL for Sparse Cost Robotic Tasks , 2019, IEEE Robotics and Automation Letters.

[404]  Russ Tedrake,et al.  Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation , 2018, CoRL.

[405]  Sergey Levine,et al.  Deep visual foresight for planning robot motion , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[406]  Marcin Andrychowicz,et al.  One-Shot Imitation Learning , 2017, NIPS.

[407]  Wolfram Burgard,et al.  Tactile Sensing for Mobile Manipulation , 2011, IEEE Transactions on Robotics.

[408]  Joshua Achiam,et al.  On First-Order Meta-Learning Algorithms , 2018, ArXiv.

[409]  Oliver Kroemer,et al.  Towards learning hierarchical skills for multi-phase manipulation tasks , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[410]  Pieter Abbeel,et al.  Learning from Demonstrations Through the Use of Non-rigid Registration , 2013, ISRR.

[411]  Danica Kragic,et al.  What's in the container? Classifying object contents from vision and touch , 2014, 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[412]  Prabhat Nagarajan,et al.  Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations , 2019, ICML.

[413]  Oliver Kroemer,et al.  Probabilistic Segmentation and Targeted Exploration of Objects in Cluttered Environments , 2014, IEEE Transactions on Robotics.

[414]  Wolfram Burgard,et al.  Operating articulated objects based on experience , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[415]  Ken Goldberg,et al.  Deep Imitation Learning for Complex Manipulation Tasks from Virtual Reality Teleoperation , 2017, ICRA.

[416]  Jan Peters,et al.  Grip Stabilization of Novel Objects Using Slip Prediction , 2018, IEEE Transactions on Haptics.

[417]  Peter Norvig,et al.  Artificial Intelligence: A Modern Approach , 1995 .

[418]  Peter K. Allen,et al.  Robot learning of everyday object manipulations via human demonstration , 2010, 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[419]  Sergey Levine,et al.  Nonlinear Inverse Reinforcement Learning with Gaussian Processes , 2011, NIPS.

[420]  Siddhartha S. Srinivasa,et al.  Unsupervised Learning for Nonlinear PieceWise Smooth Hybrid Systems , 2017, ArXiv.

[421]  Sergey Levine,et al.  Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization , 2016, ICML.

[422]  Andrew G. Barto,et al.  Efficient skill learning using abstraction selection , 2009, IJCAI 2009.

[423]  Peter Stone,et al.  Stochastic Grounded Action Transformation for Robot Learning in Simulation , 2017, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[424]  Oliver Kroemer,et al.  Generalizing pouring actions between objects using warped parameters , 2014, 2014 IEEE-RAS International Conference on Humanoid Robots.

[425]  Emanuel Todorov,et al.  Inverse Optimal Control with Linearly-Solvable MDPs , 2010, ICML.

[426]  Carl E. Rasmussen,et al.  Policy search for learning robot control using sparse data , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[427]  Marc Toussaint,et al.  Exploration in relational domains for model-based reinforcement learning , 2012, J. Mach. Learn. Res..

[428]  Maria Bauza,et al.  A probabilistic data-driven model for planar pushing , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[429]  Lihong Li,et al.  The adaptive k-meteorologists problem and its application to structure learning and feature selection in reinforcement learning , 2009, ICML '09.

[430]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[431]  Charles C. Kemp,et al.  Multimodal execution monitoring for anomaly detection during robot manipulation , 2016, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[432]  Manuel Lopes,et al.  Active learning of visual descriptors for grasping using non-parametric smoothed beta distributions , 2012, Robotics Auton. Syst..

[433]  Scott Niekum,et al.  One-Shot Learning of Multi-Step Tasks from Observation via Activity Localization in Auxiliary Video , 2018, 2019 International Conference on Robotics and Automation (ICRA).

[434]  Eren Erdal Aksoy,et al.  Learning the semantics of object–action relations by observation , 2011, Int. J. Robotics Res..

[435]  Peter Stone,et al.  Automatic Curriculum Graph Generation for Reinforcement Learning Agents , 2017, AAAI.

[436]  Oliver Brock,et al.  Interactive segmentation for manipulation in unstructured environments , 2009, 2009 IEEE International Conference on Robotics and Automation.

[437]  Benjamin Kuipers,et al.  Learning to Grasp by Extending the Peri-Personal Space Graph , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
