PERFORMING ACCURATE SPEAKER RECOGNITION BY USE OF SVM AND CEPSTRAL FEATURES

Speaker recognition from voice recordings is an active research area for which many applications have been proposed. In this study, speaker recognition is performed on cepstral features extracted from raw voice recordings. Several of the most prominent feature extraction methods, namely LPC, LPCC, MFCC, PLP and RASTA-PLP, are used, and their contribution to the performance of the applied method is investigated. The extracted features are passed to an SVM classifier, which completes the speaker recognition task. The results show that feature extraction methods such as LPCC and MFCC, combined with SVM classification, achieve an accuracy of around 97%.
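As an illustration of one of the feature types named above, the following is a minimal sketch of how LPCC features are commonly derived from a single speech frame: autocorrelation, the Levinson-Durbin recursion for the LPC coefficients, and the standard LPC-to-cepstrum recursion. The abstract does not state how the authors computed their features, so the function names, the sign convention (x[n] predicted as the sum of a_k * x[n-k]), and the omission of windowing and the gain term c_0 are all assumptions made for this example.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame (hypothetical helper)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for a_1..a_order.

    Assumed convention: x[n] is predicted as sum_k a_k * x[n - k].
    Returns (coefficients, final prediction error).
    """
    a = [0.0] * (order + 1)  # a[0] is unused; a[1..order] hold the coefficients
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for the current model order.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def lpc_to_lpcc(a, n_ceps):
    """Convert LPC coefficients a_1..a_p to cepstral coefficients c_1..c_n_ceps.

    Uses the recursion c_n = a_n + sum_{k} (k / n) * c_k * a_{n-k},
    where a_n is taken as 0 for n > p. The gain term c_0 is omitted.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


def lpcc_features(frame, order, n_ceps):
    """LPCC vector for one frame; windowing and pre-emphasis are left out."""
    r = autocorrelation(frame, order)
    a, _ = levinson_durbin(r, order)
    return lpc_to_lpcc(a, n_ceps)
```

For a single-pole model with a_1 = 0.5, the recursion gives c_n = 0.5^n / n, which is a convenient sanity check. In a full pipeline, one such vector per frame (or a per-recording summary of them) would serve as the input to the SVM classifier.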
