Cepstrum-Based Harmonics-to-Noise Ratio Measurement in Voiced Speech

The harmonics-to-noise ratio (HNR) in voiced speech indicates the ratio of the periodic to the aperiodic components of the signal. Time-domain methods for HNR estimation are problematic because of the difficulty of estimating the period markers for (pathological) voiced speech. Frequency-domain methods encounter the problem of estimating the noise level at harmonic locations. Cepstral techniques have been introduced to supply noise estimates at all frequency locations in the spectrum. A detailed description of cepstral processing is provided in order to motivate its use as an HNR estimator. The action of cepstral low-pass liftering and subsequent Fourier transformation is shown to be analogous to the action of a moving average filter. Based on this description, shortcomings of two existing cepstrum-based HNR estimators are illustrated, and a new approach is introduced and shown to provide accurate HNR measurements for synthesised glottal and voiced speech waveforms.
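The analogy between low-pass liftering and moving-average filtering can be illustrated numerically. The following is a minimal sketch, not the estimator proposed in the paper: it low-pass lifters the real cepstrum of one frame, transforms back to the log-spectral domain, and compares the result with a circular weighted average of the log-magnitude spectrum whose weights are the Fourier transform of the lifter window. The function names, the test signal (a 125 Hz harmonic series plus white noise at 8 kHz), the Hanning window, and the 30-sample lifter cutoff are all assumptions chosen for illustration.

```python
import numpy as np

def lowpass_lifter(n, n_keep):
    """Symmetric 0/1 window keeping quefrency bins below n_keep (and their mirror)."""
    w = np.zeros(n)
    w[:n_keep] = 1.0
    if n_keep > 1:
        w[n - n_keep + 1:] = 1.0
    return w

def liftered_log_spectrum(log_mag, lifter):
    """Smooth a log-magnitude spectrum by low-pass liftering its cepstrum."""
    cepstrum = np.fft.ifft(log_mag).real       # real cepstrum of the frame
    return np.fft.fft(cepstrum * lifter).real  # back to the log-spectral domain

def moving_average_equivalent(log_mag, lifter):
    """Same result computed as a circular weighted average across frequency bins."""
    n = len(log_mag)
    kernel = np.fft.fft(lifter).real / n       # weights; they sum to lifter[0] = 1
    k = np.arange(n)
    idx = (k[:, None] - k[None, :]) % n        # (k - m) mod n
    return kernel[idx] @ log_mag, kernel

# Illustrative frame (assumed values): 125 Hz harmonic series plus white noise.
fs, f0, n = 8000, 125.0, 512
t = np.arange(n) / fs
rng = np.random.default_rng(0)
x = sum(np.sin(2 * np.pi * f0 * h * t) / h for h in range(1, 20))
x = x + 0.05 * rng.standard_normal(n)
log_mag = np.log(np.abs(np.fft.fft(x * np.hanning(n))) + 1e-12)

lifter = lowpass_lifter(n, n_keep=30)          # cutoff below the 64-sample pitch period
smoothed = liftered_log_spectrum(log_mag, lifter)
averaged, kernel = moving_average_equivalent(log_mag, lifter)
print(np.max(np.abs(smoothed - averaged)))     # difference is at rounding-error level
```

The two computations agree by the convolution theorem. The weights here form a Dirichlet-type kernel that sums to one but is not everywhere positive, so the operation resembles a moving average without being a plain boxcar average.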
