Usefulness of the Comb Filtering Output for Voiced/Unvoiced Classification and Pitch Detection

In this paper, the comb filtering output (CFO) is used for voiced/unvoiced classification as well as pitch detection. The CFO is useful not only as a measure of periodicity but also as an indicator of the pitch period. When voicing detection is performed by HMM-based methods with the selected acoustic features as input, the error rate can be brought down to 0.52% in a less rigorous sense. Once voicing has been determined, the pitch contour over consecutive voiced frames is obtained by a Viterbi search over the pitch candidates derived from the CFO. The Viterbi search yields a smooth pitch contour that is well suited to speech synthesis.
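The two steps summarized above, scoring candidate pitch periods with a comb filter and then smoothing the contour with a Viterbi search, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `comb_filter_scores` and `viterbi_pitch_track`, the normalized-energy score of the comb output `x[n] + x[n - lag]`, the log-ratio transition penalty, and the `jump_weight` parameter.

```python
import numpy as np


def comb_filter_scores(frame, lag_min, lag_max):
    """Score each candidate lag by the normalized energy of x[n] + x[n - lag].

    Illustrative periodicity measure (an assumption, not the paper's CFO):
    the score approaches 1 when the frame repeats with period `lag`.
    """
    frame = np.asarray(frame, dtype=float)
    lags = np.arange(lag_min, lag_max + 1)
    ref = frame[lag_max:]  # samples that have a delayed partner for every lag
    scores = np.empty(len(lags))
    for i, lag in enumerate(lags):
        delayed = frame[lag_max - lag: len(frame) - lag]
        num = np.sum((ref + delayed) ** 2)
        den = 2.0 * (np.sum(ref ** 2) + np.sum(delayed ** 2)) + 1e-12
        scores[i] = num / den
    return lags, scores


def viterbi_pitch_track(cand_lags, cand_scores, jump_weight=0.5):
    """Pick one candidate per frame, maximizing score minus jump penalties.

    cand_lags, cand_scores: arrays of shape (n_frames, n_candidates).
    The penalty jump_weight * |log2(lag_t / lag_prev)| is a hypothetical
    choice that discourages octave jumps between neighbouring frames.
    """
    cand_lags = np.asarray(cand_lags, dtype=float)
    cand_scores = np.asarray(cand_scores, dtype=float)
    n_frames, k = cand_lags.shape
    best = cand_scores[0].copy()               # best path score ending at each candidate
    back = np.zeros((n_frames, k), dtype=int)  # back-pointers
    for t in range(1, n_frames):
        # penalty[i, j]: cost of moving from candidate i at t-1 to candidate j at t
        ratio = cand_lags[t][None, :] / cand_lags[t - 1][:, None]
        penalty = jump_weight * np.abs(np.log2(ratio))
        total = best[:, None] - penalty
        back[t] = np.argmax(total, axis=0)
        best = total[back[t], np.arange(k)] + cand_scores[t]
    # Trace the best path backwards from the final frame.
    path = np.empty(n_frames, dtype=int)
    path[-1] = int(np.argmax(best))
    for t in range(n_frames - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return cand_lags[np.arange(n_frames), path]
```

In this sketch, a frame whose half-period candidate happens to score slightly higher is still assigned the lag that agrees with its neighbours, which is the kind of smoothing the abstract attributes to the Viterbi search.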
