Speaker Recognition on Mobile Phone: Using Wavelet, Cepstral Coefficients and Probabilistic Neural Network

With the use of speech as a biometric for remote login and registration, speaker recognition is widely applied on smart mobile phones. However, the accuracy of speaker recognition usually drops rapidly for the degraded speech signal transmitted over the phone channel. This paper proposes a new robust speaker recognition model for the mobile phone context. The cepstral coefficients and their delta and delta-delta coefficients are extracted from the signal as the feature vector. Moreover, the discrete wavelet transform and a threshold de-noising process are employed to improve the performance of the features in a noisy environment. The extracted feature vectors are used as input to a probabilistic neural network, which is a faster classifier than back-propagation neural networks. To simulate speech transmitted over the mobile phone channel, the TIMIT corpus is downsampled to 8 kHz, and 20 dB Gaussian white noise is added to the corpus. The experimental results show that the proposed model obtains high accuracy and can increase the true positive rate for a given false positive rate in a noisy environment. Therefore, the proposed model is suitable for the mobile phone context.
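To illustrate why a probabilistic neural network needs no iterative training, the following is a minimal sketch of its decision rule, assuming the common formulation with a Gaussian kernel and a single smoothing parameter. The function name `pnn_classify`, the value `sigma=0.5`, and the toy 2-D vectors are hypothetical choices for illustration; the abstract does not state the paper's actual feature dimension or network settings.

```python
import math


def pnn_classify(x, training, sigma=0.5):
    """Classify feature vector x with a basic probabilistic neural network.

    training maps each speaker label to a list of stored feature vectors.
    sigma is the kernel smoothing parameter (an assumed value, not from the paper).
    """
    scores = {}
    for label, patterns in training.items():
        total = 0.0
        for p in patterns:
            # Pattern layer: Gaussian kernel centred on one stored training vector.
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: average kernel response for this speaker.
        scores[label] = total / len(patterns)
    # Decision layer: pick the speaker with the largest averaged response.
    best = max(scores, key=scores.get)
    return best, scores


# Toy 2-D vectors standing in for cepstral feature vectors (illustrative only).
training = {
    "speaker_a": [[0.0, 0.0], [0.2, 0.1], [-0.1, 0.1]],
    "speaker_b": [[2.0, 2.0], [2.1, 1.8], [1.9, 2.2]],
}

label, scores = pnn_classify([0.1, 0.0], training)
print(label)
```

"Training" here amounts to storing the labelled vectors, which is the reason such a classifier can be set up faster than a network trained by back-propagation. The cost moves to classification time, which grows with the number of stored vectors.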
