On the assessment and evaluation of voice hoarseness

Abstract This article presents a non-invasive speech processing method for assessing and evaluating voice hoarseness. A technique based on time-scale analysis decomposes the voice signal into a suitable number of high-frequency details, from which the high-frequency bands of the signal are extracted. A discriminating measure is developed that quantifies the roll-off in power across the high-frequency bands with respect to the decomposition index. The measure reflects both the presence and the degree of severity of hoarseness in the analyzed voice signals. It is supported by frequency-domain and time-series analyses of the high-frequency bands of normal and hoarse voice signals, which provide a visual aid to the clinician or therapist. A database of sustained long vowels from normal and hoarse voices is created and used to assess the presence and degree of severity of hoarseness. The results of the proposed method are compared with those obtained by perturbation analysis.
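The idea of a power roll-off measured against the decomposition index can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the article: the Haar wavelet, the four decomposition levels, the least-squares slope of log power, the synthetic "vowel" (a harmonic series at 120 Hz sampled at 16 kHz), and the use of additive Gaussian noise as a crude stand-in for hoarseness. The names `haar_step`, `detail_powers`, `rolloff_slope`, and `synthetic_vowel` are hypothetical.

```python
import math
import random


def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    n = len(x) - len(x) % 2  # drop a trailing odd sample, if any
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, n, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, n, 2)]
    return approx, detail


def detail_powers(x, levels):
    """Mean power of the detail coefficients at levels 1..levels.

    Level 1 is the highest frequency band; each further level halves the band.
    """
    powers = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        powers.append(sum(d * d for d in detail) / len(detail))
    return powers


def rolloff_slope(powers, eps=1e-12):
    """Least-squares slope of log10(power) against the decomposition index.

    This is an assumed stand-in for the article's discriminating measure.
    """
    idx = list(range(1, len(powers) + 1))
    logs = [math.log10(p + eps) for p in powers]
    mean_i = sum(idx) / len(idx)
    mean_l = sum(logs) / len(logs)
    num = sum((i - mean_i) * (l - mean_l) for i, l in zip(idx, logs))
    den = sum((i - mean_i) ** 2 for i in idx)
    return num / den


def synthetic_vowel(noise_std, seed, fs=16000, f0=120.0, n=4096):
    """Harmonic series with 1/k amplitudes plus Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    signal = []
    for t in range(n):
        value = sum(math.sin(2 * math.pi * f0 * k * t / fs) / k for k in range(1, 11))
        signal.append(value + rng.gauss(0.0, noise_std))
    return signal


LEVELS = 4  # assumed number of high-frequency details
powers_clean = detail_powers(synthetic_vowel(noise_std=0.0, seed=0), LEVELS)
powers_noisy = detail_powers(synthetic_vowel(noise_std=0.3, seed=0), LEVELS)
slope_clean = rolloff_slope(powers_clean)
slope_noisy = rolloff_slope(powers_noisy)
print("clean slope:", slope_clean)
print("noisy slope:", slope_noisy)
```

In this toy setup the noise adds roughly equal power to every detail band, so the noisy signal shows a flatter slope than the clean one. Whether real hoarse voices behave the same way, and how the article actually defines and scales its measure, cannot be inferred from this sketch.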
