Voice Pathology Detection Using Multiresolution Technique

This paper presents an automatic voice pathology detection method based on a multiresolution technique, specifically Gabor wavelets. Gabor wavelets extract information at various scales and orientations and can thereby effectively encode the patterns that distinguish normal from pathological voice signals. First, the input voice signal is transformed to the frequency domain using a frame-based Fourier transform. 2D Gabor filters with different scales and orientations are then applied to the Mel-filtered frequency representation. Principal component analysis is applied to reduce the dimension of the Gabor features, and the reduced features are fed into a support vector machine for classification. The investigation uses two well-known databases, the Massachusetts Eye and Ear Infirmary (MEEI) database and the Saarbrücken Voice Database (SVD). The results show that the proposed method outperforms some state-of-the-art techniques for voice pathology detection.
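The feature-extraction stages described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame length, hop size, number of Mel bands, Gabor wavelengths, kernel size, time-averaged pooling, and all function names are assumptions chosen for illustration, and the signals are random noise rather than MEEI or SVD recordings. The final support vector machine step is left out to keep the sketch dependency-free.

```python
import numpy as np

def mel_spectrogram(signal, sr=16000, frame_len=400, hop=160, n_fft=512, n_mels=24):
    """Frame-based FFT followed by a triangular Mel filterbank (log energies)."""
    # Slice the signal into overlapping, Hamming-windowed frames.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2  # (n_frames, n_fft//2 + 1)

    # Triangular filters spaced evenly on the Mel scale.
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    freqs = np.fft.rfftfreq(n_fft, 1.0 / sr)
    fbank = np.zeros((n_mels, freqs.size))
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fbank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return np.log(power @ fbank.T + 1e-10).T  # (n_mels, n_frames)

def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a 2D Gabor filter: Gaussian envelope times an oriented cosine."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)  # coordinate along the carrier wave
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    kernel = envelope * np.cos(2.0 * np.pi * xr / wavelength)
    return kernel - kernel.mean()  # zero mean, so constant offsets give no response

def filter2d_same(image, kernel):
    """Linear 2D convolution via FFT, cropped back to the input size."""
    h, w = image.shape
    kh, kw = kernel.shape
    shape = (h + kh - 1, w + kw - 1)
    full = np.fft.irfft2(np.fft.rfft2(image, shape) * np.fft.rfft2(kernel, shape), shape)
    return full[kh // 2: kh // 2 + h, kw // 2: kw // 2 + w]

def gabor_features(mel_spec, wavelengths=(4.0, 8.0), n_orient=4, size=9):
    """Apply a bank of Gabor filters (scales x orientations) to a Mel spectrogram."""
    feats = []
    for lam in wavelengths:
        for k in range(n_orient):
            kern = gabor_kernel(size, lam, np.pi * k / n_orient, sigma=0.5 * lam)
            resp = filter2d_same(mel_spec, kern)
            # Assumed pooling: mean response magnitude over time, per Mel band.
            feats.append(np.abs(resp).mean(axis=1))
    return np.concatenate(feats)

def pca_fit_transform(X, n_components):
    """Project mean-centred rows of X onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

# Usage on synthetic noise (placeholder data, one second at 16 kHz each).
rng = np.random.default_rng(0)
signals = [rng.standard_normal(16000) for _ in range(6)]
X = np.stack([gabor_features(mel_spectrogram(s)) for s in signals])
Z = pca_fit_transform(X, n_components=3)
print(X.shape, Z.shape)  # (6, 192) (6, 3)
```

In a full pipeline, the rows of `Z` would serve as the reduced feature vectors passed to a support vector machine, with the PCA projection fitted on training data only.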
