Paper information - Step-size parameter adaptation of multi-channel semi-blind ICA with piecewise linear model for barge-in-able robot audition

This paper describes a step-size parameter adaptation technique for multi-channel semi-blind independent component analysis (MCSB-ICA), aimed at a “barge-in-able” robot audition system. By “barge-in”, we mean that the user can speak while the robot is speaking. We focus on MCSB-ICA to build such a system because it can separate the user's speech from the robot's speech in reverberant environments. The problem with MCSB-ICA for robot audition is that estimation of the separation filter converges slowly because of its step-size parameters. Many optimization methods cannot be adopted because their computational costs are proportional to the second order of the reverberation time. Our method yields adaptive step-size parameters for MCSB-ICA at low computational cost. It is based on three techniques: 1) a recursive expression of the separation process, 2) a piecewise linear model of the step-size of the separation filter, and 3) adaptive step-size parameters with a sub-ICA-filter. Experimental results show that our approach attains faster convergence and lower computational cost than MCSB-ICA with a fixed step-size parameter.
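To make the role of the step-size parameter concrete, the following is a minimal sketch of a generic natural-gradient ICA update with a fixed step size `mu`, applied to an instantaneous two-channel mixture. It is not the MCSB-ICA algorithm of the paper: there is no reverberation model, no known robot-speech reference, no piecewise linear step-size model, and no sub-ICA-filter. The function names, the `tanh` nonlinearity, the Laplacian test sources, and the mixing matrix are all assumptions chosen for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two small dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def natural_gradient_step(w, x, mu):
    """One batch update W <- W + mu * (I - E[phi(y) y^T]) W, with phi = tanh."""
    n, t = len(w), len(x[0])
    y = matmul(w, x)  # current separated signals, n x t
    # Sample average of phi(y) y^T over the t frames.
    corr = [[sum(math.tanh(y[i][k]) * y[j][k] for k in range(t)) / t
             for j in range(n)] for i in range(n)]
    grad = [[(1.0 if i == j else 0.0) - corr[i][j] for j in range(n)]
            for i in range(n)]
    delta = matmul(grad, w)
    return [[w[i][j] + mu * delta[i][j] for j in range(n)] for i in range(n)]


def leakage(w, a):
    """Residual cross-talk of the global system W A (0 means perfect separation)."""
    total = 0.0
    for row in matmul(w, a):
        peak = max(abs(v) for v in row)
        total += sum(abs(v) for v in row) / peak - 1.0
    return total


def make_mixture(n_frames, seed=0):
    """Two Laplacian (speech-like, super-Gaussian) sources and their mixture."""
    rng = random.Random(seed)
    s = [[rng.choice((-1.0, 1.0)) * rng.expovariate(1.0)
          for _ in range(n_frames)] for _ in range(2)]
    a = [[1.0, 0.6], [0.4, 1.0]]  # hypothetical mixing matrix
    return a, matmul(a, s)


def separate(x, mu, n_iter):
    """Run n_iter fixed-step updates starting from the identity matrix."""
    w = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n_iter):
        w = natural_gradient_step(w, x, mu)
    return w


if __name__ == "__main__":
    a, x = make_mixture(500)
    for mu in (0.01, 0.1):
        w = separate(x, mu, n_iter=50)
        print(f"mu={mu}: leakage after 50 iterations = {leakage(w, a):.3f}")
```

Running the script prints the residual cross-talk after the same number of iterations for two fixed step sizes, which is the trade-off that motivates adapting the step size instead of fixing it: a small `mu` is stable but slow, while a larger `mu` converges faster but can become unstable if pushed too far.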
