Recurrent type-2 fuzzy neural network using Haar wavelet energy and entropy features for speech detection in noisy environments

Highlights
- A new feature for detecting speech in noise.
- Application of a recurrent type-2 fuzzy system to noisy speech detection.
- Good speech detection results in noisy environments.

This paper proposes a new method for detecting speech boundaries in noisy environments. The method uses Haar wavelet energy and entropy (HWEE) as detection features. The Haar wavelet energy (HWE) is computed from the robust band, that is, the band showing the most significant difference between speech and nonspeech segments across noise levels. Similarly, the wavelet energy entropy (WEE) is computed from the two wavelet energy bands whose entropy shows the most significant speech/nonspeech difference. The HWEE features are fed as inputs to a recurrent self-evolving interval type-2 fuzzy neural network (RSEIT2FNN) for classification. The RSEIT2FNN is chosen because it uses type-2 fuzzy sets, which are more robust to noise than type-1 fuzzy sets, and because its recurrent structure helps retain the context information of a test frame. The RSEIT2FNN output is compared with a threshold to decide whether a frame belongs to a speech or a nonspeech period. The HWEE-based RSEIT2FNN detector was applied to speech detection in different noisy environments at different noise levels. Comparisons with other detection methods verified the advantage of the proposed HWEE-based method.
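The feature extraction described above can be illustrated with a minimal sketch. It is not the paper's implementation: the number of decomposition levels, the band indices (`energy_band`, `entropy_bands`), and the reading of WEE as the Shannon entropy of the normalized energies of two selected bands are all assumptions made for illustration, since the abstract does not specify them. All function names are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) * s)
        detail.append((a - b) * s)
    return approx, detail


def haar_band_energies(frame, levels=3):
    """Energies of detail bands d1..dL, followed by the final approximation band."""
    if len(frame) % (2 ** levels) != 0:
        raise ValueError("frame length must be divisible by 2**levels")
    energies = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(c * c for c in detail))
    energies.append(sum(c * c for c in approx))
    return energies


def energy_entropy(energies, eps=1e-12):
    """Shannon entropy (in nats) of the normalized energy distribution."""
    total = sum(energies)
    if total <= eps:
        return 0.0
    probs = [e / total for e in energies]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def hwee_features(frame, levels=3, energy_band=0, entropy_bands=(0, 1)):
    """Hypothetical HWEE feature pair for one frame.

    energy_band and entropy_bands are placeholder choices; the paper selects
    them from speech/nonspeech statistics, which are not reproduced here.
    """
    energies = haar_band_energies(frame, levels)
    hwe = energies[energy_band]
    wee = energy_entropy([energies[i] for i in entropy_bands])
    return hwe, wee


if __name__ == "__main__":
    frame = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    print(haar_band_energies(frame))
    print(hwee_features(frame))
```

In a full detector, the two values returned per frame would be the inputs to the classifier, whose output is then compared with a threshold. The classifier itself (the RSEIT2FNN) is outside the scope of this sketch.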
