Recognition of human speech phonemes using a novel fuzzy approach

Recognition of human speech has long been an active topic among artificial intelligence and signal processing researchers. Most current approaches to the problem extract precise features from the voice signal and try to make the most of them through heavy computation. This focus on signal detail makes such systems highly sensitive to noise, which in turn calls for complex noise detection and removal algorithms and forces a trade-off between fast recognition and noise-robust recognition. This paper presents a novel approach to speech recognition based on fuzzy modeling and decision making that ignores noise rather than detecting and removing it. To do so, the speech spectrogram is converted into a fuzzy linguistic description, and this description is used in place of precise acoustic features. During training, a genetic algorithm finds appropriate definitions for the phonemes; once these definitions are available, a simple novel operator built from low-cost functions such as Max, Min, and Average performs the recognition. The approach is tested on a standard speech database and compared with a hidden Markov model (HMM) recognition system using MFCC features, a widely used speech recognition approach.
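The abstract does not spell out the fuzzy description or the recognition operator, so the following is only an illustrative sketch of the general idea and not the paper's method. It assumes that each spectrogram frame is a list of band energies normalized to [0, 1], that the linguistic terms are "low", "medium", and "high" with triangular membership functions, and that a phoneme definition is one expected term per band. The names `TERMS`, `membership`, `frame_score`, `phoneme_score`, and `recognize` are hypothetical, and the definitions are written by hand here rather than found by a genetic algorithm.

```python
from statistics import mean

# Assumed linguistic terms for a normalized band energy in [0, 1].
# Each term is a triangular membership function: (left, peak, right).
TERMS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def membership(x, term):
    """Degree (0..1) to which energy x belongs to the given linguistic term."""
    left, peak, right = TERMS[term]
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def frame_score(frame, definition):
    """Min over bands: every band must agree with the definition (fuzzy AND)."""
    return min(membership(e, t) for e, t in zip(frame, definition))


def phoneme_score(frames, definition):
    """Average over frames, so a single noisy frame cannot veto the match."""
    return mean(frame_score(f, definition) for f in frames)


def recognize(frames, definitions):
    """Max over candidate phonemes: return the best label and all scores."""
    scores = {name: phoneme_score(frames, d) for name, d in definitions.items()}
    return max(scores, key=scores.get), scores


# Hand-written toy definitions (three bands); purely hypothetical values.
definitions = {
    "a": ("high", "medium", "low"),
    "s": ("low", "low", "high"),
}
frames = [[0.9, 0.5, 0.1], [0.8, 0.6, 0.2]]
label, scores = recognize(frames, definitions)
print(label, scores)
```

Because the score depends only on coarse memberships combined with Min, Average, and Max, small perturbations of the band energies change it only slightly, which is one plausible reading of what "ignoring" noise could mean in this setting.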
