Support Vector Machines Applied to the Detection of Voice Disorders

Support Vector Machines (SVMs) have become a popular tool for discriminative classification. An exciting area of recent application of SVMs is in speech processing. In this paper discriminatively trained SVMs have been introduced as a novel approach for the automatic detection of voice impairments. SVMs have a distinctly different modelling strategy in the detection of voice impairments problem, compared to other methods found in the literature (such a Gaussian Mixture or Hidden Markov Models): the SVM models the boundary between the classes instead of modelling the probability density of each class. In this paper it is shown that the scheme proposed fed with short-term cepstral and noise parameters can be applied for the detection of voice impairments with a good performance.

[1]  Douglas A. Reynolds,et al.  Speaker identification and verification using Gaussian mixture speaker models , 1995, Speech Commun..

[2]  Vladimir Vapnik,et al.  An overview of statistical learning theory , 1999, IEEE Trans. Neural Networks.

[3]  B Boyanov,et al.  Acoustic analysis of pathological voices. A voice analysis system for the screening of laryngeal diseases. , 1997, IEEE engineering in medicine and biology magazine : the quarterly magazine of the Engineering in Medicine & Biology Society.

[4]  D. Jamieson,et al.  Identification of pathological voices using glottal noise measures. , 2000, Journal of speech, language, and hearing research : JSLHR.

[5]  D. Childers,et al.  Detection of laryngeal function using speech and electroglottographic data , 1992, IEEE Transactions on Biomedical Engineering.

[6]  E. Yumoto,et al.  Harmonics-to-noise ratio and psychophysical measurement of the degree of hoarseness. , 1984, Journal of speech and hearing research.

[7]  Hans Werner Strube,et al.  Glottal-to-Noise Excitation Ratio - a New Measure for Describing Pathological Voices , 1997 .

[8]  W S Winholtz,et al.  Vocal tremor analysis with the Vocal Demodulator. , 1992, Journal of speech and hearing research.

[9]  Guus de Krom,et al.  A Cepstrum-Based Technique for Determining a Harmonics-to-Noise Ratio in Speech Signals , 1993 .

[10]  S. Feijóo,et al.  Short-term stability measures for the evaluation of vocal quality. , 1990, Journal of speech and hearing research.

[11]  C Manfredi,et al.  A comparative analysis of fundamental frequency estimation methods with application to pathological voices. , 2000, Medical engineering & physics.

[12]  J. Hanley,et al.  The meaning and use of the area under a receiver operating characteristic (ROC) curve. , 1982, Radiology.

[13]  Stefan Todorov Hadjitodorov,et al.  Laryngeal pathology detection by means of class-specific neural maps , 2000, IEEE Transactions on Information Technology in Biomedicine.

[14]  Alvin F. Martin,et al.  The DET curve in assessment of detection task performance , 1997, EUROSPEECH.

[15]  Douglas A. Reynolds,et al.  Robust text-independent speaker identification using Gaussian mixture speaker models , 1995, IEEE Trans. Speech Audio Process..

[16]  Stefan Hadjitodorov,et al.  ACOUSTIC ANALYSIS OF PATHOLOGICAL VOICES , 1997 .

[17]  Ronald J. Baken,et al.  Clinical measurement of speech and voice , 1987 .

[18]  Dimitar D. Deliyski,et al.  Acoustic model and evaluation of pathological voice production , 1993, EUROSPEECH.

[19]  John H. L. Hansen,et al.  A comparative study of traditional and newly proposed features for recognition of speech under stress , 2000, IEEE Trans. Speech Audio Process..

[20]  H. Kasuya,et al.  Normalized noise energy as an acoustic measure to evaluate pathologic voice. , 1986, The Journal of the Acoustical Society of America.

[21]  Biing-Hwang Juang,et al.  Fundamentals of speech recognition , 1993, Prentice Hall signal processing series.

[22]  John H. L. Hansen,et al.  Discrete-Time Processing of Speech Signals , 1993 .

[23]  Friedrich S. Brodnitz,et al.  Clinical Voice Disorders: An Interdisciplinary Approach , 1986 .

[24]  Pedro Gómez Vilda,et al.  Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors , 2004, IEEE Transactions on Biomedical Engineering.

[25]  Arnold E. Aronson,et al.  Clinical Voice Disorders: An Interdisciplinary Approach , 1980 .

[26]  L. Gavidia-Ceballos,et al.  Direct speech feature estimation using an iterative EM algorithm for vocal fold pathology detection , 1996, IEEE Transactions on Biomedical Engineering.