Study of Harmonics-to-Noise Ratio and Critical-Band Energy Spectrum of Speech as Acoustic Indicators of Laryngeal and Voice Pathology

Acoustic analysis of speech signals is a noninvasive technique that has proven effective as objective support for screening vocal and voice diseases. The present study considers acoustic analysis of sustained vowels. A simple k-means nearest neighbor classifier is designed to test the efficacy of a harmonics-to-noise ratio (HNR) measure and the critical-band energy spectrum of the voiced speech signal as tools for detecting laryngeal pathologies. The classifier labels a given voice sample as either pathologic or normal. The voiced speech signal is decomposed into harmonic and noise components using an iterative signal extrapolation algorithm. The HNRs in four different frequency bands are estimated and used as features. Voiced speech is also filtered with 21 critical-band bandpass filters that mimic human auditory neurons. Normalized energies of these filter outputs are used as another set of features. The results show that the HNR and the critical-band energy spectrum can be used to correlate laryngeal pathology with voice alteration, using previously classified voice samples. This method could serve as an additional acoustic indicator that supplements the clinical diagnostic features used for voice evaluation.
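
As a rough illustration of the critical-band feature set described above, the following Python sketch computes 21 normalized band energies and assigns a label by distance to class means. It is not the study's implementation. Several things here are assumptions: the band edges are the commonly tabulated Zwicker critical-band edges from 0 to 7700 Hz (the abstract does not list its filter edges), energies are obtained by summing FFT power bins rather than by running a bank of bandpass filters, and the classifier is reduced to a nearest-class-mean rule with means supplied by the caller. The function names `critical_band_energies` and `nearest_mean_label` are hypothetical.

```python
import numpy as np

# Zwicker critical-band edges in Hz (assumed; 22 edges give 21 bands).
BARK_EDGES_HZ = np.array(
    [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
     1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700],
    dtype=float,
)


def critical_band_energies(signal, sample_rate):
    """Return 21 band energies, normalized so that they sum to 1.

    Assumption: band energy is approximated by summing FFT power bins
    inside each band instead of filtering with bandpass filters.
    """
    signal = np.asarray(signal, dtype=float)
    windowed = signal * np.hanning(len(signal))  # reduce spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    energies = np.array([
        power[(freqs >= lo) & (freqs < hi)].sum()
        for lo, hi in zip(BARK_EDGES_HZ[:-1], BARK_EDGES_HZ[1:])
    ])
    total = energies.sum()
    return energies / total if total > 0 else energies


def nearest_mean_label(features, class_means):
    """Return the label whose mean feature vector is closest (Euclidean).

    Assumption: class_means maps each label to a mean vector that was
    estimated beforehand from previously classified voice samples.
    """
    labels = list(class_means)
    distances = [np.linalg.norm(features - class_means[k]) for k in labels]
    return labels[int(np.argmin(distances))]
```

The sample rate should be at least 15.4 kHz so that the highest band (6400 to 7700 Hz) lies below the Nyquist frequency. In practice the class means would come from a clustering step over labeled training vowels, and the four band-wise HNR values could be appended to the same feature vector.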
