Towards Learning to Perceive and Reason About Liquids

Recent advances in AI and robotics have produced many impressive results with deep learning, yet no work to date has applied deep learning to the problem of liquid perception and reasoning. In this paper, we apply fully-convolutional deep neural networks to the tasks of detecting and tracking liquids. We evaluate three models: a single-frame network, a multi-frame network, and an LSTM recurrent network. Our results show that the best liquid detection results are achieved when aggregating data over multiple frames and that the LSTM network outperforms the other two in both tasks. This suggests that LSTM-based neural networks have the potential to be a key component for enabling robots to handle liquids using robust, closed-loop controllers.
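As an illustration only, the sketch below shows one way the three model variants could differ in the temporal context they receive; it is not the networks evaluated in the paper. The toy frame format, the function names, the `window` parameter, and the padding rule are all assumptions made for this example, and `ema_step` is a simple exponential moving average standing in for a recurrent cell, not an LSTM.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical toy format: one grayscale frame as an H x W list of floats.
Frame = List[List[float]]


def single_frame_inputs(video: Sequence[Frame]) -> List[List[Frame]]:
    """Single-frame variant: each input holds only the current frame."""
    return [[frame] for frame in video]


def multi_frame_inputs(video: Sequence[Frame], window: int) -> List[List[Frame]]:
    """Multi-frame variant: each input stacks the current frame with the
    (window - 1) frames before it. Early time steps repeat the first frame
    (an assumed padding rule)."""
    inputs = []
    for t in range(len(video)):
        indices = [max(0, t - k) for k in range(window - 1, -1, -1)]
        inputs.append([video[i] for i in indices])
    return inputs


def recurrent_pass(
    video: Sequence[Frame],
    step: Callable[[Frame, Frame], Tuple[Frame, Frame]],
    state: Frame,
) -> Tuple[List[Frame], Frame]:
    """Recurrent variant: frames are consumed one at a time while a state
    is carried forward, so each output can depend on all earlier frames."""
    outputs = []
    for frame in video:
        out, state = step(frame, state)
        outputs.append(out)
    return outputs, state


def ema_step(frame: Frame, state: Frame, alpha: float = 0.5) -> Tuple[Frame, Frame]:
    """Toy stand-in for a recurrent cell (NOT an LSTM): a per-pixel
    exponential moving average of the frames seen so far."""
    new_state = [
        [(1.0 - alpha) * s + alpha * f for s, f in zip(state_row, frame_row)]
        for state_row, frame_row in zip(state, frame)
    ]
    return new_state, new_state


# Three 1 x 1 frames, just to make the behaviour easy to follow.
video = [[[0.0]], [[1.0]], [[1.0]]]
singles = single_frame_inputs(video)
stacks = multi_frame_inputs(video, window=2)
outputs, final_state = recurrent_pass(video, ema_step, state=[[0.0]])
```

In this sketch the single-frame inputs carry no history, the multi-frame inputs carry a fixed window of history, and the recurrent pass carries an unbounded (if decaying) summary of everything seen so far.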
