Speaker identification by combining MFCC and phase information in noisy environments

Conventional speaker recognition methods based on MFCC ignore phase information. We recently proposed a speaker recognition method that integrates MFCC with phase information. Using the phase information reduced the speaker identification error rate by 78% for clean speech. In this paper, we describe the effectiveness of phase information for speaker identification in noisy environments. Integrating MFCC with phase information reduced the speaker identification error rates by 20% to 70% compared with using MFCC alone in noisy environments.
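The abstract does not say how the MFCC and phase information are integrated. As an illustration only, the sketch below assumes a score-level combination: each enrolled speaker has one log-likelihood from an MFCC-based model and one from a phase-based model, and the two are mixed with a weight `alpha`. The function names, the weight, and the score values are all hypothetical and are not taken from the paper.

```python
def fuse_scores(mfcc_scores, phase_scores, alpha=0.5):
    """Linearly combine per-speaker scores from two feature streams.

    Assumption: both lists hold one log-likelihood per enrolled speaker,
    in the same speaker order. alpha is a hypothetical mixing weight;
    alpha=0 uses MFCC only, alpha=1 uses phase only.
    """
    if len(mfcc_scores) != len(phase_scores):
        raise ValueError("score lists must cover the same speakers")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * m + alpha * p
            for m, p in zip(mfcc_scores, phase_scores)]


def identify(mfcc_scores, phase_scores, alpha=0.5):
    """Return the index of the speaker with the highest fused score."""
    fused = fuse_scores(mfcc_scores, phase_scores, alpha)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up scores for three enrolled speakers (illustration only).
mfcc_scores = [-41.0, -40.0, -45.0]
phase_scores = [-30.0, -36.0, -33.0]

print(identify(mfcc_scores, phase_scores, alpha=0.0))  # MFCC stream only
print(identify(mfcc_scores, phase_scores, alpha=0.5))  # both streams
```

In this toy case the two streams disagree on the best speaker, so the decision depends on `alpha`. In practice such a weight would have to be tuned on held-out data.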