Group delay features for speaker recognition

Group delay is proposed as an effective means of representing spectral phase information as a feature for speaker recognition. Robust group delay features are difficult to obtain, because large spikes in the group delay function mask its fine structure. In this paper, two group delay based features are proposed, each reducing the effect of the spikes in a different way. The first applies log compression to limit the masking effect of the spikes; the second uses a sub-band based approach, in which masking is confined to the bands that contain the spikes. The purpose of this paper is to introduce these group delay feature extraction methods. The two features are evaluated on the cellular NIST 2001 database.
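As an illustration of the quantities involved, the following is a minimal sketch rather than the method of the paper. It computes the group delay of one frame using the standard identity tau(w) = (X_R Y_R + X_I Y_I) / |X|^2, where X is the spectrum of x[n] and Y is the spectrum of n x[n], and then applies a sign-preserving logarithm to compress the spikes. The function names, the `eps` guard, the FFT size, and the particular compression formula are assumptions made for this example; the abstract does not specify them.

```python
import numpy as np

def group_delay(frame, n_fft=512, eps=1e-10):
    """Group delay (in samples) of one frame, computed without phase unwrapping.

    Uses tau = (X_R*Y_R + X_I*Y_I) / |X|^2, with X = FFT{x[n]} and Y = FFT{n*x[n]}.
    eps is an assumed guard against division by zero near spectral nulls.
    """
    n = np.arange(len(frame))
    X = np.fft.rfft(frame, n_fft)
    Y = np.fft.rfft(n * frame, n_fft)
    return (X.real * Y.real + X.imag * Y.imag) / (np.abs(X) ** 2 + eps)

def log_compress(tau):
    """Assumed compression: sign(tau) * log(1 + |tau|), which shrinks large spikes
    while leaving small values almost unchanged."""
    return np.sign(tau) * np.log1p(np.abs(tau))

# Sanity check: a unit impulse delayed by 5 samples has a constant group delay of 5.
frame = np.zeros(64)
frame[5] = 1.0
tau = group_delay(frame, n_fft=128)
compressed = log_compress(tau)
```

Spikes arise where |X|^2 approaches zero, so the denominator is the source of the large values that the compression step is meant to tame. A sub-band variant could apply the same kind of processing separately within each frequency band, but the band layout would be a further assumption.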
