Using modulation spectra for voice pathology detection and classification

In this paper, we consider the use of modulation spectra for voice pathology detection and classification. To reduce the high-dimensional feature space generated by modulation spectra, we suggest the use of Higher Order Singular Value Decomposition (HOSVD), and we propose a feature selection algorithm based on the mutual information between subjective voice quality and the computed features. Using a support vector machine (SVM) with a radial basis function (RBF) kernel as the classifier, we conducted experiments on a database of sustained vowel recordings from healthy and pathological voices. For voice pathology detection, the suggested approach achieved a detection rate of 94.1% and an Area Under the Curve (AUC) score of 97.8%. For voice pathology classification, an average detection rate of 88.6% and an average AUC of 94.8% were achieved in classifying polyp against keratosis leukoplakia, adductor spasmodic dysphonia, and vocal nodules.
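The dimensionality reduction and feature ranking steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the names `mode_basis`, `project`, and `mutual_information` are hypothetical, the ranks and bin count are arbitrary, and a simple histogram estimate of mutual information against binary class labels stands in for the paper's criterion based on subjective voice quality. The SVM stage is omitted.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: rows index `mode`, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_basis(tensor, mode, rank):
    """Leading left singular vectors of one unfolding (truncated HOSVD factor)."""
    u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
    return u[:, :rank]

def project(sample, u_acoustic, u_modulation):
    """Project one (acoustic x modulation) spectrum onto the reduced bases."""
    return (u_acoustic.T @ sample @ u_modulation).ravel()

def mutual_information(feature, labels, bins=8):
    """Histogram (plug-in) estimate of I(feature; labels) in bits."""
    edges = np.histogram_bin_edges(feature, bins=bins)
    codes = np.digitize(feature, edges[1:-1])          # bin index in 0..bins-1
    classes, label_codes = np.unique(labels, return_inverse=True)
    joint = np.zeros((bins, classes.size))
    np.add.at(joint, (codes, label_codes), 1.0)        # joint counts
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))

# Synthetic stand-in for a stack of modulation spectra (assumed sizes).
rng = np.random.default_rng(0)
n_samples, n_acoustic, n_modulation = 40, 16, 12
labels = np.repeat([0, 1], n_samples // 2)
spectra = rng.random((n_samples, n_acoustic, n_modulation))
spectra[labels == 1, 2, 3] += 2.0                      # artificial class cue

# Reduce the acoustic and modulation frequency modes, then rank features.
u_ac = mode_basis(spectra, mode=1, rank=4)
u_mod = mode_basis(spectra, mode=2, rank=3)
features = np.stack([project(s, u_ac, u_mod) for s in spectra])
scores = np.array([mutual_information(f, labels) for f in features.T])
ranking = np.argsort(scores)[::-1]                     # most informative first
```

In this sketch, each 16 x 12 spectrum is compressed to 12 features, and `ranking` orders them by estimated relevance to the labels. The top-ranked features would then be passed to a classifier such as an RBF-kernel SVM.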
