Novel Feature Extraction Algorithm using DWT and Temporal Statistical Techniques for Word Dependent Speaker’s Recognition

This paper proposes a novel method for word-dependent speaker recognition based on temporal statistical techniques applied to Discrete Wavelet Transform (DWT) coefficients. The speaker is recognized from words drawn from a specific dataset; for demonstration, the dataset consists of the digits zero to nine. The algorithm uses wavelet analysis because the DWT provides a multi-resolution time-frequency analysis suited to quasi-stationary speech signals. The speech signal of each word in the dataset is decomposed using the Symlet 7 wavelet as the mother wavelet. A 1D feature vector is extracted from the approximation coefficients of the DWT, and temporal-statistical methods are used to construct unique features for each word in the dataset. Compared with methods such as LPC, LPCC, MFCC, and power spectral analysis (FFT), the proposed method yields a more robust feature for speaker recognition. The methodology is presented in detail, followed by the results obtained and a discussion.
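
As a rough illustration of the pipeline described in the abstract (decompose a word's signal, keep the approximation coefficients, then summarize them over time), the following sketch may help. It rests on several assumptions that are not taken from the paper: a Haar filter stands in for the Symlet 7 mother wavelet so that the example needs only the standard library, the per-segment statistics are the mean and standard deviation, and the names `haar_approx`, `approx_coefficients`, `temporal_stat_features`, and `extract_features`, along with the `levels` and `n_segments` parameters, are hypothetical. It is not the authors' implementation.

```python
import math
import statistics


def haar_approx(signal):
    """One DWT level: return approximation coefficients only.

    ASSUMPTION: Haar low-pass filter used as a stand-in for Symlet 7.
    """
    samples = list(signal)
    if len(samples) % 2:
        samples.append(samples[-1])  # pad odd-length input with last sample
    scale = math.sqrt(2.0)
    return [(samples[i] + samples[i + 1]) / scale
            for i in range(0, len(samples), 2)]


def approx_coefficients(signal, levels):
    """Apply the low-pass branch repeatedly to reach the requested level."""
    coeffs = list(signal)
    for _ in range(levels):
        coeffs = haar_approx(coeffs)
    return coeffs


def temporal_stat_features(coeffs, n_segments):
    """Split coefficients into consecutive time segments and summarize each.

    ASSUMPTION: mean and population standard deviation are the statistics;
    the abstract does not say which ones the paper uses.
    """
    if n_segments < 1 or len(coeffs) < n_segments:
        raise ValueError("need at least one coefficient per segment")
    seg_len = len(coeffs) // n_segments
    features = []
    for k in range(n_segments):
        start = k * seg_len
        # The last segment absorbs any leftover coefficients.
        end = start + seg_len if k < n_segments - 1 else len(coeffs)
        segment = coeffs[start:end]
        features.append(statistics.fmean(segment))
        features.append(statistics.pstdev(segment))
    return features


def extract_features(signal, levels=3, n_segments=4):
    """Return a 1D feature vector for one spoken-word signal."""
    return temporal_stat_features(approx_coefficients(signal, levels),
                                  n_segments)
```

Calling `extract_features(samples)` on a list of audio samples returns a flat vector with two values per segment. To follow the abstract more closely, the Haar step could be swapped for a Symlet 7 decomposition from a wavelet library, leaving the segment-wise statistics unchanged.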
