Combined Architecture of Adaptive Beamforming and Blind Source Separation for Speech Recognition of Intelligent Service Robots

Successful speech recognition for intelligent robots in noisy environments depends on the performance of the preprocessing stages employed. Even though acoustic signals are often corrupted in high-noise environments, speech recognition systems such as the widely used HTK do not address signal distortion. We propose an architecture that effectively combines adaptive beamforming (ABF) and blind source separation (BSS) algorithms in the spatial domain. To avoid permutation ambiguity and heavy computational complexity in the BSS system, an adaptive generalized sidelobe canceller is placed in front of the BSS system. We slightly modify the conventional convolutive mixture model of the BSS for fast processing in hardware implementations. Unlike conventional BSS, the proposed system does not suffer from permutation ambiguity: because the target angle of the front-end beamformer is fixed, it always supplies the enhanced signal and the reference noise signal to the two predefined inputs of the BSS. The proposed system also reduces the heavy computation that the BSS would otherwise require when it has more than two inputs. The proposed time-domain approach can be readily implemented in hardware for real-time operation. We evaluated the structure and assessed its performance with a DSP module. The results of the speech recognition tests show that the proposed combined system achieves a high speech recognition rate in noisy environments and outperforms either the ABF or the BSS system alone.
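The front-end stage described above can be illustrated with a minimal time-domain sketch of a generalized sidelobe canceller. This is not the authors' implementation: the function name `gsc_front_end`, the adjacent-difference blocking matrix, the NLMS update, the parameter values, and the choice of averaging the blocked channels to form the noise reference are all assumptions made for illustration. The microphone signals are assumed to be already time-aligned toward the fixed target angle.

```python
import numpy as np


def gsc_front_end(x, num_taps=8, mu=0.1, eps=1e-8):
    """Hypothetical GSC front end (illustrative sketch only).

    x: array of shape (num_mics, num_samples), assumed to be
       time-aligned toward the fixed target direction.
    Returns (enhanced, noise_ref), the two signals that a
    two-input BSS stage could take as its fixed inputs.
    """
    num_mics, num_samples = x.shape

    # Fixed beamformer: averaging aligned channels preserves the target.
    fixed = x.mean(axis=0)

    # Blocking matrix (assumed form): adjacent-channel differences
    # cancel the aligned target and leave noise-only signals.
    blocked = x[:-1] - x[1:]  # shape (num_mics - 1, num_samples)

    weights = np.zeros((num_mics - 1, num_taps))
    history = np.zeros((num_mics - 1, num_taps))
    enhanced = np.zeros(num_samples)

    for n in range(num_samples):
        # Shift the tap-delay line and insert the newest blocked samples.
        history = np.roll(history, 1, axis=1)
        history[:, 0] = blocked[:, n]

        # Subtract the adaptive noise estimate from the fixed beam.
        err = fixed[n] - np.sum(weights * history)
        enhanced[n] = err

        # Normalized LMS update (assumed adaptation rule).
        weights += mu * err * history / (np.sum(history ** 2) + eps)

    # Noise reference (assumed choice): mean of the blocked channels.
    noise_ref = blocked.mean(axis=0)
    return enhanced, noise_ref
```

Because the enhanced output and the noise reference always leave the front end in the same order, a downstream two-input separation stage would not need to resolve which output carries the target, which is the property the abstract attributes to fixing the beamformer's target angle.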
