Tool-Use Model to Reproduce the Goal Situations Considering Relationship Among Tools, Objects, Actions and Effects Using Multimodal Deep Neural Networks

We propose a tool-use model that enables a robot to act toward a provided goal. It is important to consider the features of four factors simultaneously: tools, objects, actions, and effects, because they are interrelated and each can influence the others. The tool-use model is constructed with deep neural networks (DNNs) trained on multimodal sensorimotor data: image, force, and joint-angle information. To allow the robot to learn tool use, we collect training data by controlling the robot to perform various object operations, using several tools and multiple actions that lead to different effects. The tool-use model is then trained on these data, learning sensorimotor coordination and acquiring the relationships among tools, objects, actions, and effects in its latent space. We can give the robot a task goal by providing an image that shows the target placement and orientation of the object. Given the goal image, the tool-use model detects the features of the tools and objects and automatically determines how to act to reproduce the target effects. The robot then generates actions that adapt to the situation in real time, even when the tools and objects are unknown and more complex than those used in training.
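As a rough illustration of how such a goal-conditioned multimodal model might be assembled, the sketch below combines a CNN image encoder with an LSTM that integrates image, force, and joint-angle features, conditions them on an encoded goal image, and predicts the next joint angles. This is a minimal sketch in PyTorch, not the paper's actual implementation: all class names, layer sizes, and feature dimensions (ImageEncoder, ToolUseModel, img_dim, and so on) are hypothetical assumptions.

# Minimal sketch of a multimodal tool-use model (hypothetical names and
# shapes; the paper's exact architecture may differ). A CNN compresses
# camera frames into low-dimensional features; an LSTM integrates image,
# force, and joint-angle features together with a goal-image feature,
# and predicts the next joint angles.
import torch
import torch.nn as nn

class ImageEncoder(nn.Module):
    """CNN mapping a 3x64x64 frame to a small feature vector."""
    def __init__(self, feat_dim=10):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),  # 32 -> 16
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, feat_dim),
        )

    def forward(self, x):
        return self.conv(x)

class ToolUseModel(nn.Module):
    """Recurrent sensorimotor model conditioned on a goal image."""
    def __init__(self, img_dim=10, force_dim=6, joint_dim=7, hidden=100):
        super().__init__()
        self.encoder = ImageEncoder(img_dim)
        # Input: current image feature + force + joints + goal image feature.
        self.rnn = nn.LSTM(img_dim * 2 + force_dim + joint_dim, hidden,
                           batch_first=True)
        self.head = nn.Linear(hidden, joint_dim)  # next joint angles

    def forward(self, images, forces, joints, goal_image):
        # images: (batch, time, 3, 64, 64); forces/joints: (batch, time, dim)
        b, t = images.shape[:2]
        img_feat = self.encoder(images.flatten(0, 1)).view(b, t, -1)
        goal_feat = self.encoder(goal_image).unsqueeze(1).expand(-1, t, -1)
        x = torch.cat([img_feat, forces, joints, goal_feat], dim=-1)
        h, _ = self.rnn(x)
        return self.head(h)

# Training would minimize next-step joint-angle prediction error, e.g.:
#   model = ToolUseModel()
#   pred = model(img_seq, force_seq, joint_seq, goal_img)
#   loss = nn.functional.mse_loss(pred[:, :-1], joint_seq[:, 1:])

At execution time, feeding the model the current sensor stream together with the goal image and commanding the predicted joint angles step by step would yield the kind of online, goal-directed action generation the abstract describes.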
