Perceiving and reasoning about liquids using fully convolutional networks

Liquids are an important part of many common manipulation tasks in human environments. If robots are to accomplish these tasks, they must be able to interact with liquids intelligently. In this paper, we investigate ways for robots to perceive and reason about liquids. That is, a robot asks the questions "What in the visual data stream is liquid?" and "How can I use that to infer all the potential places where liquid might be?" We collected two data sets to evaluate these questions, one using a realistic liquid simulator and another using our robot. We used fully convolutional neural networks to learn to detect and track liquids across pouring sequences. Our results show that these networks are able to perceive and reason about liquids, and that integrating temporal information is important to performing such tasks well.
