Step-size parameter adaptation of multi-channel semi-blind ICA with piecewise linear model for barge-in-able robot audition

This paper describes a step-size parameter adaptation technique for multi-channel semi-blind independent component analysis (MCSB-ICA) in a “barge-in-able” robot audition system. By “barge-in”, we mean that the user can speak while the robot is speaking. We focused on MCSB-ICA to achieve such an audition system because it can separate a user's speech from a robot's speech in reverberant environments. The problem with MCSB-ICA for robot audition is that the separation filter converges slowly, because the convergence speed depends on the step-size parameters. Many optimization methods cannot be adopted because their computational costs are proportional to the square of the reverberation time. Our method yields adaptive step-size parameters for MCSB-ICA at low computational cost. It is based on three techniques: 1) a recursive expression of the separation process, 2) a piecewise linear model of the step-size of the separation filter, and 3) adaptive step-size parameters with a sub-ICA-filter. Experimental results show that our approach converges faster and at lower computational cost than MCSB-ICA with a fixed step-size parameter.
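The idea behind a piecewise linear step-size model can be pictured with a minimal sketch: a handful of knot values is expanded into one step size per filter tap by linear interpolation, so only the knot values would need to be adapted rather than one parameter per tap. This is an illustration only, not the paper's algorithm. The function name, the knot positions, and the numeric values are all hypothetical.

```python
def piecewise_linear_step_sizes(knot_taps, knot_values, num_taps):
    """Expand a few knot values into one step size per filter tap.

    Assumes knot_taps is strictly increasing and spans taps 0..num_taps-1.
    Hypothetical helper for illustration; not taken from the paper.
    """
    sizes = []
    seg = 0  # index of the linear segment that contains the current tap
    for tap in range(num_taps):
        # Advance to the next segment once the tap passes its right knot.
        while seg < len(knot_taps) - 2 and tap > knot_taps[seg + 1]:
            seg += 1
        t0, t1 = knot_taps[seg], knot_taps[seg + 1]
        v0, v1 = knot_values[seg], knot_values[seg + 1]
        frac = (tap - t0) / (t1 - t0)
        sizes.append(v0 + frac * (v1 - v0))
    return sizes


# Assumed numbers: 5 knots describe the step sizes of a 65-tap filter.
knot_taps = [0, 16, 32, 48, 64]
knot_values = [0.10, 0.05, 0.02, 0.01, 0.005]
mu = piecewise_linear_step_sizes(knot_taps, knot_values, num_taps=65)
```

In this sketch, 5 numbers stand in for 65 per-tap step sizes. The decreasing knot values are an assumption chosen for the example and do not reflect results from the paper.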
