Efficient Multiple Kernel Support Vector Machine Based Voice Activity Detection

In this letter, we propose a multiple kernel support vector machine (MK-SVM) method for voice activity detection (VAD) based on multiple features. To make MK-SVM based VAD practical, we adapt the idea of multiple kernel learning (MKL) to an efficient cutting-plane structural SVM solver. We further discuss the performance of the MK-SVM under two different optimization objectives: minimum classification error (MCE) and improvement of the receiver operating characteristic (ROC) curve. Our experimental results show that the proposed method not only achieves better overall performance by taking advantage of multiple features but also has low computational complexity.
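
To illustrate the general idea behind multiple kernel learning, the sketch below builds one Gram matrix per feature type and merges them as a convex combination, K = sum_m d_m K_m with d_m >= 0 and sum_m d_m = 1. It is only a minimal illustration under stated assumptions: the toy feature values, the RBF kernel choice, the `gamma` settings, and the fixed weights are all hypothetical, and the helper names `rbf_kernel` and `combine_kernels` are made up for this example. In MKL the weights are learned jointly with the classifier; that step, and the cutting-plane solver described in the letter, are not shown here.

```python
import math

def rbf_kernel(X, gamma):
    """Gram matrix of the Gaussian (RBF) kernel for a list of feature vectors."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K

def combine_kernels(kernels, weights):
    """Convex combination K = sum_m d_m * K_m (d_m >= 0, sum_m d_m = 1)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical toy data: two feature types describing the same four frames.
feat_a = [[0.1, 0.2], [0.0, 0.3], [2.0, 1.9], [2.1, 2.2]]
feat_b = [[1.0], [1.1], [5.0], [4.8]]

K_a = rbf_kernel(feat_a, gamma=0.5)   # kernel on the first feature type
K_b = rbf_kernel(feat_b, gamma=0.1)   # kernel on the second feature type

# Fixed weights are an assumption; MKL would learn them from training data.
K = combine_kernels([K_a, K_b], weights=[0.7, 0.3])
```

The combined matrix `K` could then be passed to any SVM solver that accepts a precomputed kernel. Because each base kernel is positive semidefinite and the weights are non-negative, the combination is also a valid kernel.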
