Paper information - On enhancing feature sequence filtering with filter-bank energy transformation in speaker verification with telephone speech

This paper presents a novel feature enhancement method for channel robustness with short utterances. The transform reduces the time-varying component of the channel distortion by applying a band-pass filter in the filter-bank domain on a frame-by-frame basis. This enhances the channel-cancelling effect provided by techniques based on feature trajectory filtering. The transformation parameters are set using a relative importance analysis based on a discriminant function. In text-dependent speaker verification with telephone speech, the transform reduces the equal error rate (EER) by 10.8%, with further improvements of 23.5% and 40% when combined with RASTA or CMN, respectively.
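The processing chain described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `filter_along_bands` and `cmn` are hypothetical, and the kernel `(1, 0, -1)` is only a placeholder band-pass filter, since the abstract does not give the parameters obtained from the relative importance analysis. The input is assumed to be a matrix of log filter-bank energies with one row per frame.

```python
import numpy as np


def filter_along_bands(log_fbe, kernel=(1.0, 0.0, -1.0)):
    """Filter each frame's log filter-bank energies along the band axis.

    log_fbe: array of shape (n_frames, n_bands).
    kernel:  FIR coefficients; (1, 0, -1) is an assumed placeholder whose
             response is zero at both ends of the spectrum (band-pass).
    """
    log_fbe = np.asarray(log_fbe, dtype=float)
    k = np.asarray(kernel, dtype=float)
    # Each frame is processed independently (frame-by-frame filtering).
    return np.apply_along_axis(
        lambda frame: np.convolve(frame, k, mode="same"), 1, log_fbe
    )


def cmn(features):
    """Mean normalization over time: subtract the per-dimension mean."""
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(50, 14))    # synthetic stand-in for log energies
    channel = rng.normal(size=(1, 14))   # time-invariant channel offset
    out_clean = cmn(filter_along_bands(clean))
    out_dist = cmn(filter_along_bands(clean + channel))
    # The time-invariant offset is cancelled by the mean normalization.
    print(np.allclose(out_clean, out_dist))
```

In this sketch the filter along the band axis is linear, so a channel offset that is constant over time still passes through it as a constant and is removed by the mean subtraction. The filter's own role, per the abstract, is to attenuate the time-varying part of the distortion that trajectory filtering alone does not cancel; how well a given kernel does that depends on parameters not stated here.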

Néstor Becerra Yoma | Claudio Garretón
