Speech recognition using a wavelet packet adaptive network based fuzzy inference system

Abstract: This paper presents an expert speech recognition system that combines feature extraction and classification for real speech signals. The proposed model, a wavelet packet adaptive network based fuzzy inference system (WPANFIS), consists of two layers: a wavelet packet layer and an adaptive network based fuzzy inference system (ANFIS) layer. The wavelet packet layer performs adaptive feature extraction in the time–frequency domain and is composed of wavelet packet decomposition and wavelet packet entropy. The system is evaluated on noisy speech signals, and test results showing its effectiveness are presented. The rate of correct classification is about 92% for the sample speech signals.
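As an illustration of the wavelet packet layer described above, the following is a minimal sketch of wavelet packet decomposition followed by a per-sub-band entropy computation. The abstract does not specify the mother wavelet, the decomposition depth, or the entropy definition, so the Haar filter, the depth of 3, the Shannon entropy of normalized coefficient energies, and all function names below are assumptions made for illustration, not the authors' implementation. The ANFIS layer is not sketched here.

```python
import numpy as np

def haar_step(x):
    """Split a signal into Haar approximation and detail halves (assumed wavelet)."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:  # pad odd-length input by repeating the last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def wavelet_packet_leaves(x, depth):
    """Full wavelet packet tree: every node is split again at every level."""
    nodes = [np.asarray(x, dtype=float)]
    for _ in range(depth):
        nodes = [half for node in nodes for half in haar_step(node)]
    return nodes  # 2**depth sub-bands

def shannon_entropy(coeffs, eps=1e-12):
    """Shannon entropy of the normalized coefficient energies of one sub-band."""
    energy = np.asarray(coeffs, dtype=float) ** 2
    total = energy.sum()
    if total < eps:  # an (almost) empty sub-band carries no information
        return 0.0
    p = energy / total
    p = p[p > 0]  # skip zero-probability terms to avoid log(0)
    return float(-(p * np.log(p)).sum())

def wp_entropy_features(x, depth=3):
    """Feature vector: one entropy value per terminal wavelet packet node."""
    return np.array([shannon_entropy(c) for c in wavelet_packet_leaves(x, depth)])

if __name__ == "__main__":
    # Hypothetical input: a noisy sinusoid standing in for a speech frame.
    rng = np.random.default_rng(0)
    t = np.arange(256)
    frame = np.sin(2 * np.pi * t / 16) + 0.1 * rng.standard_normal(256)
    print(wp_entropy_features(frame, depth=3))
```

In a pipeline of the kind the abstract outlines, a feature vector such as the one returned by `wp_entropy_features` would be passed to the classifier layer; how the original system frames the signal and normalizes the features is not stated in the abstract.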
