Segment-level pyramid match kernels for the classification of varying length patterns of speech using SVMs

Classifying long-duration speech, represented as varying-length sets of feature vectors, with a support vector machine (SVM) requires a suitable kernel. In this paper, we propose a novel segment-level pyramid match kernel (SLPMK) for the classification of such varying-length patterns. The kernel is designed by partitioning the speech signal into increasingly finer segments and matching the corresponding segments. We study the performance of SVM-based classifiers using the proposed SLPMKs for speech emotion recognition and speaker identification, and compare it with that of SVM-based classifiers using other dynamic kernels.
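The segment-and-match idea can be illustrated with a minimal sketch. The abstract does not specify how segments are represented or how levels are combined, so everything concrete here is an assumption made for illustration: frames are hard-assigned to a hypothetical codebook, corresponding segments are compared by histogram intersection, level j uses 2^j equal-length segments, and coarser levels are down-weighted by powers of two. The function names (`segment_histograms`, `slpmk`, `slpmk_normalized`) are likewise hypothetical and are not taken from the paper.

```python
import numpy as np

def segment_histograms(frames, codebook, n_segments):
    """Split a (T, d) frame sequence into contiguous segments and count,
    per segment, how many frames fall nearest to each codeword."""
    # Squared Euclidean distance from every frame to every codeword.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)  # hard assignment (an assumption)
    n_codewords = codebook.shape[0]
    hists = np.zeros((n_segments, n_codewords))
    for s, seg in enumerate(np.array_split(labels, n_segments)):
        hists[s] = np.bincount(seg, minlength=n_codewords)
    return hists

def slpmk(x, y, codebook, n_levels=3):
    """Weighted sum over levels of histogram intersections between
    corresponding segments of two utterances x and y."""
    value = 0.0
    for level in range(n_levels):
        n_segments = 2 ** level  # finer partition at each level
        hx = segment_histograms(x, codebook, n_segments)
        hy = segment_histograms(y, codebook, n_segments)
        match = np.minimum(hx, hy).sum()  # histogram intersection
        # Illustrative weighting: finest level has weight 1,
        # each coarser level has half the weight of the next finer one.
        weight = 2.0 ** (level - (n_levels - 1))
        value += weight * match
    return value

def slpmk_normalized(x, y, codebook, n_levels=3):
    """Cosine-style normalization so that k(x, x) == 1 regardless of length."""
    kxy = slpmk(x, y, codebook, n_levels)
    kxx = slpmk(x, x, codebook, n_levels)
    kyy = slpmk(y, y, codebook, n_levels)
    return kxy / np.sqrt(kxx * kyy)

# Toy data: two "utterances" of different lengths with 4-dim features.
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))  # stand-in for a learned codebook
x = rng.normal(size=(50, 4))
y = rng.normal(size=(73, 4))
print(slpmk_normalized(x, y, codebook))
```

Because histogram intersection is a positive semidefinite kernel and the levels are combined with positive weights, a Gram matrix built this way could be passed to an SVM implementation that accepts precomputed kernels. The normalization step is one simple way to reduce the dependence on utterance length.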
