Bayesian compressive sensing for phonetic classification

In this paper, we introduce a novel Bayesian compressive sensing (CS) technique for phonetic classification. CS is often used to characterize a signal from a few supporting training examples, similar to k-nearest neighbors (kNN) and support vector machines (SVMs). However, unlike SVMs and kNN, CS allows the number of supports to be adapted to the specific signal being characterized. On the TIMIT phonetic classification task, we find that our CS method outperforms the SVM, kNN and Gaussian mixture model (GMM) methods. Our CS method achieves an accuracy of 80.01%, one of the best reported results in the literature to date.
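To make the idea of signal-dependent supports concrete, the following is a minimal sketch of sparse-representation classification. It is not the Bayesian CS method of this paper: as a stand-in, it assumes an L1-regularized least-squares solver (iterative soft thresholding) and a per-class residual decision rule. The names `ista_lasso` and `classify`, the penalty `lam`, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def ista_lasso(H, y, lam=0.01, n_iter=500):
    """Minimize 0.5*||y - H b||^2 + lam*||b||_1 by iterative soft thresholding.

    Assumption: a generic L1 solver stands in for the paper's Bayesian CS solver.
    """
    step = 1.0 / (np.linalg.norm(H, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    b = np.zeros(H.shape[1])
    for _ in range(n_iter):
        z = b - step * (H.T @ (H @ b - y))                        # gradient step
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return b

def classify(H, labels, y, lam=0.01):
    """Pick the class whose training columns best reconstruct y.

    H holds one training example per column; labels[i] is the class of column i.
    Returns the predicted class and the number of nonzero coefficients (supports).
    """
    b = ista_lasso(H, y, lam)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - H[:, labels == c] @ b[labels == c]) for c in classes
    ]
    return classes[int(np.argmin(residuals))], int(np.count_nonzero(b))

# Toy data (hypothetical): two training examples per class, three features.
H = np.array([[1.0, 0.9, 0.0, 0.0],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.0, 0.0, 0.1]])
labels = np.array([0, 0, 1, 1])
y = np.array([0.0, 1.0, 0.05])  # test vector lying close to the class-1 examples

label, n_support = classify(H, labels, y)
print(label, n_support)
```

In this sketch the number of supports is not fixed in advance, as k is in kNN. It is whatever number of coefficients the solver leaves nonzero for the given test vector, which is the property the abstract contrasts with kNN and SVMs.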
