A Robust Joint GSS And Source Localization Algorithm For Robot Audition In Strong Reverberant Environment

Research on robot audition studies techniques that help robots acquire acoustic information from their surroundings. In many real situations the environment is adverse, for example highly reverberant, and the microphone signals received by the robot consist of a superposition of several sounds. The geometric source separation (GSS) algorithm uses prior geometric information to separate simultaneously active sound sources and is therefore well suited to robot audition applications. However, the performance of GSS deteriorates as the reverberation time increases, even when highly accurate source locations are available, and classic source localization (SL) methods themselves rarely perform well in reverberant environments. In this paper, a joint GSS and SL algorithm is proposed for robot audition applications in strongly reverberant environments. The new method estimates the parameters of blind dereverberation (BD), SL and GSS alternately, similar to the conditional separation and dereverberation (CSD) method, in order to relax the single-source assumption made by many BD algorithms. Furthermore, the proposed method can also be used on its own as a robust SL algorithm if necessary. Experimental results verify the robustness of the proposed algorithm.
