Deep-learning in Identification of Vocal Pathologies

This work addresses a four-class classification problem for vocal pathologies using a Deep Neural Network. Three groups of features were tested, extracted from the speech of subjects with dysphonia, vocal fold paralysis, and chronic laryngitis, and from control subjects. The best-performing group of features is related to the source: relative jitter, relative shimmer, and harmonics-to-noise ratio (HNR). A Deep Neural Network architecture with two levels was tested. The first level consists of 7 estimators, and the second level is a decision maker. At the second level of the Deep Neural Network, an accuracy of 39.5% is reached for a diagnosis among the 4 classes under analysis.
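As an illustration of the source-related features named above, the following minimal sketch computes relative jitter and relative shimmer using the common definition: the mean absolute difference between consecutive glottal cycles, divided by the mean value, expressed as a percentage. This is an assumption about the formula, not necessarily the exact algorithm used in this work. The function name `relative_perturbation` and the sample period and amplitude values are hypothetical, and cycle detection in the speech signal is assumed to have been done beforehand.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive cycle values,
    divided by the mean value, as a percentage.

    Applied to glottal periods this gives relative jitter; applied to
    peak amplitudes it gives relative shimmer (assumed common definition).
    """
    values = [float(v) for v in values]
    n = len(values)
    if n < 2:
        raise ValueError("at least two cycles are required")
    mean_value = sum(values) / n
    if mean_value == 0:
        raise ValueError("mean value must be non-zero")
    # Average cycle-to-cycle variation over the n - 1 consecutive pairs.
    mean_abs_diff = sum(abs(b - a) for a, b in zip(values, values[1:])) / (n - 1)
    return 100.0 * mean_abs_diff / mean_value


# Hypothetical glottal periods (seconds) and peak amplitudes for one sustained vowel.
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.78, 0.82, 0.79]

jitter_percent = relative_perturbation(periods)
shimmer_percent = relative_perturbation(amplitudes)
print(f"relative jitter: {jitter_percent:.2f} %")
print(f"relative shimmer: {shimmer_percent:.2f} %")
```

Both measures share the same formula and differ only in the per-cycle quantity supplied, which is why a single helper covers both here.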
