Active binaural localization of multiple sound sources

Sound source localization is an important capability for autonomous robots that conduct missions such as search and rescue or target tracking in challenging environments. However, localizing multiple sound sources and tracking static sound sources during self-motion are both difficult tasks, especially when the number of sound sources or reflections increases. This study presents two robotic hearing approaches based on a human perception model (Wallach, 1939) that combines interaural time difference (ITD) and head-turn motion data to locate sound sources. The first method uses a fitting-based approach to recognize the changing trends of the cross-correlation function of the binaural inputs. Its effectiveness was validated using data collected from a two-microphone array rotating in a non-anechoic environment; the experiments show that it can separate and localize up to three sound sources with the same spectral content (white noise) at different azimuth and elevation angles. The second method uses an extended Kalman filter (EKF) that estimates the orientation of a sound source by fusing the robot's self-motion and ITD data, recursively reducing the localization error. This method requires limited memory and can keep tracking the changing relative positions of several static sources while the robot moves. In the experiments, up to three sources were tracked simultaneously with a two-microphone array.

Sound source localization methods based on robot head motion and ITD data are proposed. Multiple sources with overlapping spectra are localized with binaural inputs in non-anechoic spaces. The data fitting method separates the sources based on correlogram changing patterns. An EKF is established for each sound source to keep track of it during self-motion.
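The idea of combining ITD with head-turn data can be illustrated with a minimal sketch. It is not the fitting procedure or the EKF of this study: it assumes a simple far-field two-microphone model in which the ITD equals (d/c)·cos(elevation)·sin(azimuth − head yaw), and it recovers the direction of a single static source by ordinary linear least squares. The microphone spacing, the speed of sound, the noise level, and the function names `model_itd` and `fit_direction` are all hypothetical choices made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s (assumed)
MIC_SPACING = 0.15      # m, assumed distance between the two microphones
MAX_ITD = MIC_SPACING / SPEED_OF_SOUND  # largest possible ITD in seconds


def model_itd(azimuth, elevation, head_yaw):
    """Assumed far-field ITD (seconds) of a static source; angles in radians."""
    return MAX_ITD * np.cos(elevation) * np.sin(azimuth - head_yaw)


def fit_direction(head_yaws, itds):
    """Least-squares azimuth and |elevation| from ITDs measured during a head turn.

    Uses ITD / MAX_ITD = a*cos(yaw) - b*sin(yaw),
    with a = cos(el)*sin(az) and b = cos(el)*cos(az).
    """
    head_yaws = np.asarray(head_yaws, dtype=float)
    design = np.column_stack([np.cos(head_yaws), -np.sin(head_yaws)])
    target = np.asarray(itds, dtype=float) / MAX_ITD
    (a, b), *_ = np.linalg.lstsq(design, target, rcond=None)
    azimuth = np.arctan2(a, b)                    # full 360 degrees, no front-back ambiguity
    cos_el = np.clip(np.hypot(a, b), 0.0, 1.0)    # amplitude of the ITD curve
    elevation = np.arccos(cos_el)                 # up/down sign is not observable from yaw alone
    return azimuth, elevation


# Simulated head turn: one full rotation in 5-degree steps, noisy ITD readings.
rng = np.random.default_rng(0)
head_yaws = np.deg2rad(np.arange(0, 360, 5))
true_az, true_el = np.deg2rad(40.0), np.deg2rad(30.0)
itds = model_itd(true_az, true_el, head_yaws) + rng.normal(0.0, 10e-6, head_yaws.size)

az_hat, el_hat = fit_direction(head_yaws, itds)
print(f"azimuth ~ {np.rad2deg(az_hat):.1f} deg, |elevation| ~ {np.rad2deg(el_hat):.1f} deg")
```

Under this assumed model, the phase of the ITD curve over the head turn gives the azimuth and its amplitude gives the elevation, which is why a single static ITD reading cannot separate the two. A recursive estimator such as an EKF would process the same measurements one at a time instead of in a batch.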
