Paper information - Vowel Recognition using Neural Networks

Summary: Speech recognition techniques have advanced dramatically in recent years. Nevertheless, errors caused by environmental noise remain a serious problem in recognition. Algorithms that detect and track lip motion have been widely used to improve the performance of speech recognition. This paper presents a novel technique for recognizing vowels: lip features extracted with a combined method serve as input parameters to a neural network that performs the recognition. The accuracy of the proposed method is verified by using it to recognize the 6 main Farsi vowels.
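
The pipeline outlined in the summary (lip features in, one of 6 vowel classes out) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's method: the four features (for example lip width, height, area, and aspect ratio), the hidden-layer size, the training settings, and the synthetic Gaussian data that stands in for real lip measurements.

```python
import numpy as np

N_VOWELS = 6      # from the summary: 6 main Farsi vowels
N_FEATURES = 4    # assumption: e.g. lip width, height, area, aspect ratio
N_HIDDEN = 16     # assumption: hidden layer size


def make_synthetic_data(rng, per_class=50):
    """Synthetic stand-in for lip features: one Gaussian cluster per vowel."""
    centers = rng.normal(0.0, 3.0, size=(N_VOWELS, N_FEATURES))
    X = np.vstack([c + rng.normal(0.0, 0.3, size=(per_class, N_FEATURES))
                   for c in centers])
    y = np.repeat(np.arange(N_VOWELS), per_class)
    # Standardize each feature so gradient descent behaves well.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    return X, y


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def train_mlp(X, y, rng, epochs=2000, lr=0.5):
    """One-hidden-layer network trained with full-batch gradient descent."""
    W1 = rng.normal(0.0, 0.1, (N_FEATURES, N_HIDDEN))
    b1 = np.zeros(N_HIDDEN)
    W2 = rng.normal(0.0, 0.1, (N_HIDDEN, N_VOWELS))
    b2 = np.zeros(N_VOWELS)
    onehot = np.eye(N_VOWELS)[y]
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                 # hidden activations
        p = softmax(h @ W2 + b2)                 # class probabilities
        grad_logits = (p - onehot) / len(X)      # cross-entropy gradient
        grad_h = (grad_logits @ W2.T) * (1.0 - h ** 2)
        W2 -= lr * (h.T @ grad_logits)
        b2 -= lr * grad_logits.sum(axis=0)
        W1 -= lr * (X.T @ grad_h)
        b1 -= lr * grad_h.sum(axis=0)
    return W1, b1, W2, b2


def predict(params, X):
    """Return the index (0..5) of the most probable vowel for each row."""
    W1, b1, W2, b2 = params
    return np.argmax(np.tanh(X @ W1 + b1) @ W2 + b2, axis=1)


rng = np.random.default_rng(0)
X, y = make_synthetic_data(rng)
params = train_mlp(X, y, rng)
accuracy = float(np.mean(predict(params, X) == y))
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed here only reflects how separable the synthetic clusters are; it says nothing about the results reported in the paper.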

Khashayar Yaghmaie | Vahideh Sadat Sadeghi
