Signal separation for robust speech recognition based on phase difference information obtained in the frequency domain

In this paper, we present a new two-microphone approach that improves speech recognition accuracy when speech is masked by other speech. The algorithm improves on previous systems that have been successful in separating signals based on differences in the arrival time of signal components at two microphones. The present algorithm differs from these efforts in that the signal selection takes place in the frequency domain. We observe that additional smoothing of the phase estimates over time and frequency is needed to support adequate speech recognition performance. We demonstrate that the algorithm described in this paper provides better recognition accuracy than time-domain-based signal separation algorithms, at less than 10 percent of the computational cost.

Richard M. Stern | Bhiksha Raj | Chanwoo Kim | Kshitiz Kumar
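
The frequency-domain selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a target source that reaches both microphones at the same time (zero inter-microphone delay), and the function name, FFT size, hop size, Hamming window, and delay threshold are all hypothetical choices made for illustration. The smoothing of phase estimates over time and frequency that the abstract reports as necessary is omitted here.

```python
import numpy as np

def phase_difference_mask(left, right, fs, n_fft=512, hop=256, max_delay_s=5e-5):
    """Return a binary time-frequency mask and the masked two-channel average.

    A bin is kept when the delay implied by the inter-microphone phase
    difference is within max_delay_s of zero (assumed target direction).
    All parameter defaults are illustrative assumptions.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    window = np.hamming(n_fft)

    # Slice both signals into overlapping frames: shape (n_frames, n_fft).
    n_frames = 1 + (len(left) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    spec_l = np.fft.rfft(left[idx] * window, axis=1)
    spec_r = np.fft.rfft(right[idx] * window, axis=1)

    # Phase difference per bin, wrapped to (-pi, pi].
    dphi = np.angle(spec_l * np.conj(spec_r))

    # Convert phase difference to a delay estimate: tau = dphi / omega.
    omega = 2.0 * np.pi * np.fft.rfftfreq(n_fft, d=1.0 / fs)  # rad/s
    delay = np.zeros_like(dphi)
    delay[:, 1:] = dphi[:, 1:] / omega[1:]  # skip the DC bin (omega = 0)

    # Keep only bins whose estimated delay is close to the target's (zero).
    mask = np.abs(delay) <= max_delay_s
    masked_spec = mask * 0.5 * (spec_l + spec_r)
    return mask, masked_spec
```

Given two equal-length microphone signals sampled at `fs` Hz, `mask, masked_spec = phase_difference_mask(left, right, fs)` marks the time-frequency bins attributed to the assumed target direction. A hard threshold on unsmoothed phase estimates is the simplest possible selection rule; at high frequencies the wrapped phase becomes ambiguous for large delays, which this sketch does not address.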
