Recognition of Consonant-Vowel (CV) Units of Speech in a Broadcast News Corpus Using Support Vector Machines

This paper addresses issues in recognizing a large number of subword units of speech using support vector machines (SVMs). In conventional approaches to multi-class pattern recognition using SVMs, learning involves discriminating each class against all the other classes. We propose a close-class-set discrimination method suitable for large-class-set pattern recognition problems. In the proposed method, learning involves discriminating each class only against the subset of classes that are confusable with it, which form its close-class-set. We study the effectiveness of the proposed method in reducing the complexity of multi-class pattern recognition systems based on the one-against-the-rest and one-against-one approaches. We discuss how symmetry and uniformity in the size of the close-class-sets affect the performance of these approaches. We present our studies on the recognition of 86 frequently occurring Consonant-Vowel (CV) units in a continuous speech database of broadcast news.
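To make the idea of a close-class-set concrete, the following is a minimal sketch, not the authors' procedure. It assumes that confusability is estimated from a confusion matrix and that each close-class-set holds the `k` most confused classes; the abstract does not say how the sets are actually built. The function names (`close_class_sets`, `symmetrize`, `pairwise_classifiers`) and the toy matrix are hypothetical.

```python
from itertools import combinations

def close_class_sets(confusion, k):
    """Assumed rule: the close-class-set of class i holds the k other
    classes with the largest confusion counts in row i."""
    n = len(confusion)
    sets = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Highest confusion first; ties broken by class index.
        others.sort(key=lambda j: (-confusion[i][j], j))
        sets.append(set(others[:k]))
    return sets

def symmetrize(sets):
    """Make membership mutual: if j is close to i, then i is close to j.
    The resulting sets may no longer be uniform in size."""
    out = [set(s) for s in sets]
    for i, s in enumerate(sets):
        for j in s:
            out[j].add(i)
    return out

def pairwise_classifiers(sets):
    """One-against-one pairs kept when either class lists the other."""
    n = len(sets)
    return [(i, j) for i, j in combinations(range(n), 2)
            if j in sets[i] or i in sets[j]]

# Toy 4-class confusion matrix (rows: true class, columns: predicted class).
confusion = [
    [50, 8, 1, 0],
    [6, 40, 2, 1],
    [0, 3, 45, 9],
    [1, 0, 7, 52],
]

sets_k1 = close_class_sets(confusion, 1)   # uniform size 1
pairs_k1 = pairwise_classifiers(sets_k1)   # 2 pairs instead of all 6
sets_k2 = symmetrize(close_class_sets(confusion, 2))

# With all classes, one-against-one needs n(n-1)/2 classifiers.
full_pairs_86 = 86 * 85 // 2               # 3655 for the 86 CV units
```

In this toy case, restricting each class to its single closest class leaves 2 pairwise classifiers rather than 6. Symmetrizing the `k = 2` sets gives sets of sizes 3, 2, 3, and 2, which illustrates why symmetry and uniform size cannot always be obtained together.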
