An Adaptive Non Reference Anchor Array Framework for Distant Speech Recognition

Distant speech recognition over microphone arrays is challenging, especially in multi-source environments. In this paper, a non reference anchor array (NRA) framework for distant speech recognition is proposed. In addition to the primary array that captures the speech source of interest, the NRA framework uses a non reference anchor array to capture the interfering speech sources. The framework uses a linearly constrained minimum variance (LCMV) beamformer such that the signal coming from the look direction is preserved while rejecting correlated interferences coming from the same direction as the source of interest. The proposed method is evaluated through experiments on clean speech acquisition from distant microphones and on distant speech recognition using the TIMIT and MONC databases. Experimental results indicate a reasonable improvement over correlation, subspace, and standard minimum variance beamforming methods.
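
To make the LCMV constraint concrete, the following is a minimal narrowband sketch of the textbook LCMV solution, w = R^{-1} C (C^H R^{-1} C)^{-1} f, which minimizes output power subject to C^H w = f. It is not the NRA framework itself: the anchor array is not modeled, and the uniform linear geometry, the 1 kHz frequency, the interferer angle, the small diagonal loading, and the names `steering_vector` and `lcmv_weights` are all assumptions chosen for illustration.

```python
import numpy as np

def steering_vector(n_mics, spacing, angle_rad, freq, c=343.0):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    positions = np.arange(n_mics) * spacing
    delays = positions * np.sin(angle_rad) / c
    return np.exp(-2j * np.pi * freq * delays)

def lcmv_weights(R, C, f, loading=1e-6):
    """Textbook LCMV weights: minimize w^H R w subject to C^H w = f.

    A small diagonal loading term (an assumption, not from the paper)
    keeps the covariance matrix well conditioned.
    """
    n = R.shape[0]
    R_loaded = R + loading * np.trace(R).real / n * np.eye(n)
    Rinv_C = np.linalg.solve(R_loaded, C)            # R^{-1} C
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, f)

# Hypothetical scenario: 8 microphones, 5 cm spacing, one 1 kHz frequency bin.
n_mics, spacing, freq = 8, 0.05, 1000.0
a_look = steering_vector(n_mics, spacing, 0.0, freq)             # look direction
a_int = steering_vector(n_mics, spacing, np.deg2rad(40), freq)   # interferer

# Covariance model: unit-power sensor noise plus a strong interferer.
R = np.eye(n_mics, dtype=complex) + 10.0 * np.outer(a_int, a_int.conj())

# Constraints: unit gain toward the look direction, a null on the interferer.
C = np.column_stack([a_look, a_int])
f = np.array([1.0, 0.0], dtype=complex)

w = lcmv_weights(R, C, f)
print("gain toward look direction:", abs(np.vdot(w, a_look)))
print("gain toward interferer:   ", abs(np.vdot(w, a_int)))
```

Here the interferer direction is assumed known so that it can be placed in the constraint matrix. In the framework described above, the anchor array is what supplies information about the interfering sources.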
