STReSSD: Sim-To-Real from Sound for Stochastic Dynamics

Sound is an information-rich medium that captures dynamic physical events. This work presents STReSSD, a framework that uses sound to bridge the simulation-to-reality gap for stochastic dynamics, demonstrated on the canonical case of a bouncing ball. A physically motivated noise model captures the stochastic behavior of the ball upon collision with the environment. A likelihood-free Bayesian inference framework infers the parameters of the noise model, along with a material property, the coefficient of restitution, from audio observations. The same inference framework and the calibrated stochastic simulator are then used to learn a probabilistic model of ball dynamics. The predictive capabilities of the dynamics model are tested in two robotic experiments. In the first, open-loop predictions anticipate the probability of successfully bouncing a ball into a cup. In the second, audio perception is integrated with a robotic arm to track and deflect a bouncing ball in real time. We envision this work as a step towards integrating audio-based inference into dynamic robotic tasks. Experimental results can be viewed at https://youtu.be/b7pOrgZrArk.
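
To make the idea of inferring restitution and collision noise from sound concrete, the following is a minimal sketch, not the STReSSD implementation. It assumes that impact times have already been extracted from audio (for example by onset detection), that a ball bouncing vertically has a flight time between impacts proportional to its rebound speed (so the ratio of successive flight times equals the coefficient of restitution at that impact), and that collision noise is a Gaussian perturbation of the restitution. Rejection ABC stands in here as one simple likelihood-free scheme. All function names, prior ranges, and the noise form are hypothetical choices made for illustration.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2


def simulate_impact_times(e, sigma, h0=0.5, n_bounces=6, rng=None):
    """Impact times of a ball dropped from height h0 (hypothetical simulator).

    Assumption: each impact draws its own restitution from N(e, sigma^2),
    clipped to [0, 1]. This is an illustrative noise model only.
    """
    rng = np.random.default_rng() if rng is None else rng
    v = np.sqrt(2.0 * G * h0)   # speed just before the first impact
    t = np.sqrt(2.0 * h0 / G)   # time of the first impact
    times = [t]
    for _ in range(n_bounces):
        e_k = np.clip(rng.normal(e, sigma), 0.0, 1.0)
        v = e_k * v             # rebound speed after this impact
        t += 2.0 * v / G        # ballistic flight time until the next impact
        times.append(t)
    return np.array(times)


def restitution_from_times(times):
    """Per-impact restitution as the ratio of successive flight times."""
    gaps = np.diff(times)
    return gaps[1:] / gaps[:-1]


def summary(times):
    """Summary statistics: mean and spread of the per-impact restitution."""
    r = restitution_from_times(times)
    return np.array([r.mean(), r.std()])


def rejection_abc(observed_times, n_samples=5000, keep=100, seed=0):
    """Keep the prior draws whose simulated summaries best match the data."""
    rng = np.random.default_rng(seed)
    target = summary(observed_times)
    n_bounces = len(observed_times) - 1
    # Assumed uniform priors over (e, sigma); the ranges are illustrative.
    thetas = np.column_stack([
        rng.uniform(0.5, 1.0, n_samples),
        rng.uniform(0.0, 0.05, n_samples),
    ])
    dists = np.array([
        np.linalg.norm(
            summary(simulate_impact_times(e, s, n_bounces=n_bounces, rng=rng))
            - target
        )
        for e, s in thetas
    ])
    return thetas[np.argsort(dists)[:keep]]


if __name__ == "__main__":
    # Synthetic "audio" observation generated with known parameters.
    observed = simulate_impact_times(
        0.8, 0.02, n_bounces=10, rng=np.random.default_rng(1)
    )
    posterior = rejection_abc(observed)
    print("posterior mean (e, sigma):", posterior.mean(axis=0))
```

The accepted samples approximate a posterior over the restitution and its noise scale. Drawing parameters from that set and running the simulator forward then gives a distribution over future impact times rather than a single trajectory, which is the kind of probabilistic prediction the abstract describes.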
