Text Independent Composite Speaker Identification/Verification Using Multiple Features

The main objective of this paper is to explore how the choice of features affects composite speaker identification/verification. We evaluate six feature sets: line spectral frequencies (LSF), differential line spectral frequencies (DLSF), mel frequency cepstral coefficients (MFCC), discrete cosine transform cepstrum (DCTC), perceptual linear predictive cepstrum (PLP) and mel frequency perceptual linear predictive cepstrum (MF-PLP). These features are extracted from the speech and a model is trained for each speaker with the K-means clustering procedure. The speaker identification system is evaluated on noise-added test speech, with a speaker identified by the minimum distance between the test features and the cluster centroids. The experimental results show the performance of the proposed algorithm and indicate the best feature set among those proposed, for 50 speakers chosen randomly from the "TIMIT" database. The identification results are further analysed to determine which features give better speaker verification performance in terms of equal error rate. In this work, the F-ratio is computed as a theoretical measure to validate the experimental results for both identification and verification.
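As an illustration of the pipeline described above, the following is a minimal sketch, not the authors' implementation. It assumes that feature vectors (for example MFCC frames) have already been extracted, and it replaces them with synthetic Gaussian data. The function names (`kmeans`, `avg_min_distance`, `identify`, `f_ratio`), the codebook size of 8, the Euclidean distance, and the per-dimension F-ratio definition (variance of speaker means divided by average within-speaker variance) are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns a (k, dim) codebook of centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct training vectors (fancy indexing copies).
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance of every vector to every centroid: (n, k).
        d = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def avg_min_distance(features, codebook):
    """Mean distance from each test vector to its nearest centroid."""
    d = np.sqrt(((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2))
    return d.min(axis=1).mean()

def identify(test_features, codebooks):
    """Return the speaker whose codebook is closest to the test features."""
    scores = {spk: avg_min_distance(test_features, cb)
              for spk, cb in codebooks.items()}
    return min(scores, key=scores.get)

def f_ratio(features_by_speaker):
    """Per-dimension F-ratio (assumed definition): variance of the speaker
    means divided by the average within-speaker variance."""
    means = np.array([f.mean(axis=0) for f in features_by_speaker])
    within = np.mean([f.var(axis=0) for f in features_by_speaker], axis=0)
    return means.var(axis=0) / within

# Synthetic stand-in for extracted features: 3 hypothetical speakers, 12-dim frames.
rng = np.random.default_rng(42)
speaker_means = {"spk_a": -5.0, "spk_b": 0.0, "spk_c": 5.0}
train = {s: rng.normal(m, 1.0, size=(200, 12)) for s, m in speaker_means.items()}
codebooks = {s: kmeans(x, k=8) for s, x in train.items()}

# Noise-added test utterance drawn from spk_c's distribution.
test_utterance = rng.normal(5.0, 1.0, size=(50, 12)) + rng.normal(0.0, 0.3, size=(50, 12))
predicted = identify(test_utterance, codebooks)
ratios = f_ratio(list(train.values()))
print(predicted, ratios.round(1))
```

A larger F-ratio for a feature dimension suggests that speakers are better separated along it, which is why such a measure can serve as a theoretical cross-check on the identification and verification results.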
