The Use of Group Delay Features of Linear Prediction Model for Speaker Recognition

A new text-independent speaker identification method is presented. The phase spectrum of an all-pole linear prediction (LP) model is used to derive the speech features. Each feature is a pair of numbers calculated from an extremum of the group delay of the LP model spectrum: the first component is the argument of a group delay maximum (the frequency at which it occurs), and the second is an estimate of the spectrum bandwidth at that extremum. A similarity metric that uses these group delay features is introduced. The metric is adapted for text-independent speaker identification under the general assumption that the test speech channel may contain multiple speakers. It is demonstrated that an automatic speaker recognition system using the proposed features and similarity metric outperforms systems based on Gaussian mixture models with Mel-frequency cepstral coefficient, formant, antiformant, and pitch features.
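
To make the feature definition concrete, the following is a minimal sketch of how (frequency, bandwidth) pairs could be read off the group delay of an all-pole model 1/A(z). It is an illustration under stated assumptions, not the paper's implementation: the function names `lp_group_delay` and `group_delay_features`, the grid size, and in particular the bandwidth estimate (which inverts the single-pole relation between peak group delay and pole radius) are assumptions, since the abstract does not specify how the bandwidth is estimated. The LP coefficients here are synthesized from two known resonances rather than estimated from speech.

```python
import numpy as np

def lp_group_delay(a, n_freq=512):
    """Group delay (in samples) of the all-pole model 1/A(z) on [0, pi).

    a holds the coefficients of A(z) = a[0] + a[1] z^-1 + ... with a[0] = 1.
    """
    a = np.asarray(a, dtype=float)
    k = np.arange(len(a))
    w = np.pi * np.arange(n_freq) / n_freq
    E = np.exp(-1j * np.outer(w, k))   # E[i, k] = exp(-j * w[i] * k)
    A = E @ a                          # A(e^{jw})
    B = E @ (k * a)                    # sum_k k * a[k] * exp(-j w k)
    # Group delay of A is Re(B / A); for 1/A the sign flips.
    return w, -np.real(B / A)

def group_delay_features(a, fs, n_freq=512):
    """Return (peak frequency in Hz, bandwidth estimate in Hz) pairs."""
    w, tau = lp_group_delay(a, n_freq)
    mid = tau[1:-1]
    # Interior local maxima with positive group delay.
    idx = np.where((mid > tau[:-2]) & (mid >= tau[2:]) & (mid > 0))[0] + 1
    feats = []
    for i in idx:
        # Assumption: treat the peak as a single pole of radius r, for which
        # the peak group delay is r / (1 - r) and the bandwidth is -ln(r) * fs / pi.
        r = tau[i] / (1.0 + tau[i])
        feats.append((w[i] * fs / (2 * np.pi), -np.log(r) * fs / np.pi))
    return feats

# Hypothetical test model: resonances at 500 Hz and 1500 Hz, fs = 8 kHz.
fs = 8000.0
centers = np.array([500.0, 1500.0])
bandwidths = np.array([50.0, 100.0])
radii = np.exp(-np.pi * bandwidths / fs)
poles = radii * np.exp(2j * np.pi * centers / fs)
a = np.real(np.poly(np.concatenate([poles, np.conj(poles)])))
feats = group_delay_features(a, fs)
for freq, bw in feats:
    print(f"peak at {freq:.1f} Hz, bandwidth estimate {bw:.1f} Hz")
```

Because the other poles contribute a small negative background to the group delay at each peak, the single-pole inversion slightly overestimates the bandwidth; a real system would estimate the LP coefficients per speech frame and may use a different bandwidth estimator.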
