Combining evidence from multiple modular networks for recognition of consonant-vowel units of speech

In this paper, we present a method for combining evidence from multiple classifiers to recognize a large number of subword units of speech from small training data sets. Grouping criteria based on phonetic description are used to build multiple modular networks for recognizing the large number of units. Feature vectors are nonlinearly compressed to obtain reduced-dimensional patterns, and separate classifiers are trained on the uncompressed and the compressed feature vectors. Evidence from the multiple classifiers at different stages of the recognition system is combined using the sum rule. The effectiveness of the proposed method is demonstrated on the recognition of isolated utterances of 145 consonant-vowel units of speech.
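
The combination step can be illustrated with a minimal sketch of the sum rule. This is not the system described above: it assumes that each classifier produces one score per class (for example, an estimate of the posterior probability) and that all classifiers use the same class ordering. The function names `sum_rule` and `decide` and the example scores are hypothetical.

```python
from typing import List, Sequence


def sum_rule(scores: Sequence[Sequence[float]]) -> List[float]:
    """Combine classifier outputs by adding the scores class by class.

    scores[i][k] is the score that classifier i assigns to class k.
    """
    if not scores:
        raise ValueError("need the output of at least one classifier")
    n_classes = len(scores[0])
    if any(len(s) != n_classes for s in scores):
        raise ValueError("all classifiers must score the same set of classes")
    return [sum(s[k] for s in scores) for k in range(n_classes)]


def decide(scores: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the largest combined score."""
    combined = sum_rule(scores)
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical scores for three classes from two classifiers, e.g. one
# trained on uncompressed and one on compressed feature vectors.
uncompressed_scores = [0.5, 0.3, 0.2]
compressed_scores = [0.2, 0.6, 0.2]

# Class 1 has the largest summed score (about 0.9, against 0.7 and 0.4).
winner = decide([uncompressed_scores, compressed_scores])
```

In this example the first classifier alone would choose class 0, but the summed evidence favours class 1, which shows how a combined decision can differ from the decision of an individual classifier.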
