Discrete wavelet transforms with multiclass SVM for phoneme recognition

A phoneme recognition system based on Discrete Wavelet Transforms (DWT) and Support Vector Machines (SVMs) is designed for multi-speaker continuous speech environments. Phonemes are divided into frames, and the DWT is used to obtain fixed-dimensional feature vectors. For the multiclass SVM, the one-against-one method with the RBF kernel was implemented. To further improve the accuracy obtained, a priority scheme was added to predict the three most likely phonemes. After classification, all frames were regrouped in order to evaluate the system in terms of substitution, deletion and insertion errors. The percentage correct and accuracy obtained from the designed phoneme recognition system were 63.08% and 53.27% respectively. All tests were carried out on the TIMIT database. This phoneme recognition system is intended to be implemented on a dedicated chip, with the aim of running approximately 100 times faster than the software implementation.
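
The abstract reports two figures, percentage correct and accuracy, without stating how they are computed. The sketch below assumes the common HTK-style definitions, where N is the number of reference phonemes and S, D and I are the substitution, deletion and insertion counts: percentage correct = 100 (N - S - D) / N and accuracy = 100 (N - S - D - I) / N. The function names `align_counts` and `score` and the toy label sequences are hypothetical, introduced only for illustration; this is not the authors' evaluation code.

```python
def align_counts(ref, hyp):
    """Count (substitutions, deletions, insertions) along one
    minimum-edit-distance alignment of hyp against ref."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = (total_errors, S, D, I) for ref[:i] versus hyp[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)   # every reference label deleted
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)   # every hypothesis label inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            e, s, d, k = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, s, d, k)           # match, no error
            else:
                diag = (e + 1, s + 1, d, k)   # substitution
            e, s, d, k = cost[i - 1][j]
            dele = (e + 1, s, d + 1, k)       # deletion
            e, s, d, k = cost[i][j - 1]
            ins = (e + 1, s, d, k + 1)        # insertion
            # Ties are broken in favour of the diagonal move.
            cost[i][j] = min(diag, dele, ins, key=lambda t: t[0])
    _, s, d, k = cost[n][m]
    return s, d, k


def score(ref, hyp):
    """Return (percent_correct, accuracy) under the assumed HTK-style definitions."""
    s, d, i = align_counts(ref, hyp)
    n = len(ref)
    percent_correct = 100.0 * (n - s - d) / n
    accuracy = 100.0 * (n - s - d - i) / n
    return percent_correct, accuracy


if __name__ == "__main__":
    reference = ["a", "b", "c", "d", "e"]    # toy labels, not TIMIT phones
    hypothesis = ["a", "c", "d", "e", "f"]   # "b" deleted, "f" inserted
    print(score(reference, hypothesis))
```

Under these assumed definitions, insertions lower the accuracy but not the percentage correct, so the gap between the two reported figures (63.08% and 53.27%) would correspond to the insertion rate.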
