A V/UV Speech Detection based on Characterization of Background Noise

This paper presents an adaptive system for voiced/unvoiced (V/UV) speech detection in the presence of background noise. Genetic algorithms select the features that give the best V/UV detection for each condition reported by a background noise classifier (NC) and a signal-to-noise ratio estimation (SNRE) system. The system was implemented and tested using the TIMIT speech corpus and its phonetic classification. The results were compared with a non-adaptive classification system and with the V/UV detectors adopted by three important speech coding standards: LPC10, ITU-T G.723.1 and ETSI AMR. In all cases the adaptive V/UV classifier outperformed the traditional solutions.
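
The feature-selection idea can be illustrated with a minimal sketch. It is not the authors' implementation: the four synthetic features, the sign-of-sum detector, the noise class names ("car", "babble"), the SNR band labels and all GA parameters are assumptions made for illustration only. A small genetic algorithm searches over binary feature masks separately for each (noise class, SNR band) condition, and the resulting table is what an NC and SNRE front end would index at run time.

```python
import random

def make_frames(n, noise_std, rng):
    """Synthetic frames (hypothetical): features 0-1 carry the V/UV label, 2-3 are distractors."""
    frames = []
    for _ in range(n):
        voiced = rng.random() < 0.5
        centre = 1.0 if voiced else -1.0
        feats = [
            centre + rng.gauss(0.0, noise_std),
            centre + rng.gauss(0.0, noise_std),
            rng.gauss(0.0, 3.0),
            rng.gauss(0.0, 3.0),
        ]
        frames.append((feats, voiced))
    return frames

def accuracy(mask, frames):
    """Fitness: accuracy of a toy detector that sums the selected features and thresholds at 0."""
    if not any(mask):
        return 0.0
    hits = 0
    for feats, voiced in frames:
        score = sum(f for f, keep in zip(feats, mask) if keep)
        hits += (score > 0.0) == voiced
    return hits / len(frames)

def select_features(frames, n_feats=4, pop_size=8, generations=15, seed=0):
    """Small GA over binary masks: tournament selection, one-point crossover, bit-flip mutation."""
    rng = random.Random(seed)
    # The all-features mask is placed in the initial population; with elitism
    # the returned mask is then never worse than using every feature.
    pop = [[1] * n_feats]
    pop += [[rng.randint(0, 1) for _ in range(n_feats)] for _ in range(pop_size - 1)]
    fit = lambda m: accuracy(m, frames)
    for _ in range(generations):
        best = max(pop, key=fit)
        nxt = [best[:]]  # elitism: carry the best mask over unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=fit)  # tournament of size 2
            b = max(rng.sample(pop, 2), key=fit)
            cut = rng.randint(1, n_feats - 1)
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < 0.1 else bit for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fit)

# Hypothetical conditions: (noise class, SNR band, feature noise level).
CONDITIONS = [("car", "low_snr", 1.5), ("car", "high_snr", 0.4), ("babble", "low_snr", 2.0)]

data_rng = random.Random(42)
data, masks = {}, {}
for noise_class, snr_band, std in CONDITIONS:
    key = (noise_class, snr_band)
    data[key] = make_frames(300, std, data_rng)
    masks[key] = select_features(data[key])

# At run time the NC and SNRE outputs would pick the mask to apply.
for key, mask in masks.items():
    print(key, mask, round(accuracy(mask, data[key]), 3))
```

In this sketch the mask for each condition is evaluated on the same frames used for the search; a real evaluation would hold out separate test data.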
