A Survey on Machine Learning Approaches for Automatic Detection of Voice Disorders.

The human voice production system is an intricate biological device capable of modulating pitch and loudness. Internal and external factors often damage the vocal folds and alter the voice, with consequences for both physical functioning and emotional well-being. It is therefore important to identify voice changes at an early stage, giving the patient an opportunity to overcome any ramifications and improve their quality of life. Automatic detection of voice disorders using machine learning techniques plays a key role here, as it has been shown to ease the process of understanding the voice disorder. In recent years, many researchers have investigated techniques for automated systems that help clinicians with early diagnosis of voice disorders. In this paper, we survey the research conducted on automatic detection of voice disorders and explore how these systems identify the different types of voice disorders. We also analyze the databases, feature extraction techniques, and machine learning approaches used in these works.
