Hierarchical Classification and System Combination for Automatically Identifying Physiological and Neuromuscular Laryngeal Pathologies.

OBJECTIVES Speech signal processing techniques have contributed substantially to pathologic voice identification, in which voice samples are classified as healthy or unhealthy. A less common approach is to identify the laryngeal pathology itself, for which a noninvasive identification method is an important step toward preliminary diagnosis. In this study, a hierarchical classifier and a combination of systems are used to improve the accuracy of a three-class identification system (healthy, physiological larynx pathologies, and neuromuscular larynx pathologies).

METHOD Three main subject classes were considered: subjects with physiological larynx pathologies (vocal fold nodules and edemas: 59 samples), subjects with neuromuscular larynx pathologies (unilateral vocal fold paralysis: 59 samples), and healthy subjects (36 samples). The variables studied were the speech task (sustained vowel /a/ or continuous read speech), features with or without perceptual information, and features with or without direct information about formants, each evaluated using single classifiers. A hierarchical classification system was then designed based on the results of these evaluations.

RESULTS The resulting system combines the analysis of continuous speech with that of the commonly used sustained vowel /a/ to obtain spectral and perceptual speech features. It achieved an accuracy of 84.4%, an improvement of approximately 9% over the stand-alone approach. For pathologic voice identification (healthy versus pathologic), the accuracy obtained was 98.7%, and the identification accuracy for the two pathology classes was 81.3%.

CONCLUSIONS Hierarchical classification and system combination yield significant benefits and introduce a modular approach to the classification of larynx pathologies.
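The two-stage structure described in the abstract can be illustrated with a minimal sketch: a first stage separates healthy from pathologic voices, and a second stage separates the two pathology classes. Everything specific here is an assumption made for illustration and is not taken from the study: the nearest-centroid base classifier, the class and method names, and the toy two-dimensional features. The system described above also combines different speech tasks and feature sets, which this sketch does not attempt to reproduce.

```python
from statistics import fmean


class NearestCentroid:
    """Stand-in base classifier (assumption: the abstract does not name one)."""

    def fit(self, X, y):
        # One centroid per label: the per-dimension mean of that label's samples.
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [fmean(col) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        # Return the label whose centroid is closest in squared Euclidean distance.
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return min(self.centroids, key=lambda lab: dist2(self.centroids[lab]))


class HierarchicalClassifier:
    """Stage 1: healthy vs. pathologic. Stage 2: physiological vs. neuromuscular."""

    def __init__(self):
        self.stage1 = NearestCentroid()
        self.stage2 = NearestCentroid()

    def fit(self, X, y):
        # Stage 1 sees every sample, with both pathology classes merged.
        y1 = ["healthy" if t == "healthy" else "pathologic" for t in y]
        self.stage1.fit(X, y1)
        # Stage 2 is trained on pathologic samples only.
        Xp = [x for x, t in zip(X, y) if t != "healthy"]
        yp = [t for t in y if t != "healthy"]
        self.stage2.fit(Xp, yp)
        return self

    def predict_one(self, x):
        if self.stage1.predict_one(x) == "healthy":
            return "healthy"
        return self.stage2.predict_one(x)


# Toy, hypothetical feature vectors (not data from the study).
X = [(0.0, 0.1), (0.2, -0.1), (5.0, 0.2), (5.2, -0.2), (4.9, 5.0), (5.1, 5.2)]
y = ["healthy", "healthy", "physiological", "physiological",
     "neuromuscular", "neuromuscular"]

clf = HierarchicalClassifier().fit(X, y)
print(clf.predict_one((0.1, 0.0)))
print(clf.predict_one((5.0, 0.1)))
print(clf.predict_one((5.0, 4.9)))
```

Because each stage is an independent classifier, either stage could in principle be retrained or given a different feature set without touching the other, which is one way to read the modularity mentioned in the conclusions.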
