Voice Disorder Identification by using Hilbert-Huang Transform (HHT) and K Nearest Neighbor (KNN).

OBJECTIVES Clinical evaluation of dysphonic voices involves a multidimensional approach, including a variety of instrumental and noninstrumental measures. Acoustic analysis provides objective, noninvasive, and intelligent measures of voice quality. Based on sound recordings, this paper proposes a new method for classifying voice disorders using HHT and KNN. METHODS In this research, 12 features of each sample are calculated by HHT. Based on the Linear Prediction Cepstral Coefficient (LPCC) algorithm, each sample is further characterized by 9 features. Once each sample is expressed by these 21 features, a classifier is constructed based on KNN. The KNN classifier was then compared with random forest and extra trees classifiers in terms of voice disorder classification performance. RESULTS The experimental results reveal that the KNN classifier outperformed the other two classifiers, with an accuracy of 93.3%, precision of 93%, recall of 95%, F1-score of 94%, and an area under the receiver operating characteristic curve of 0.976. CONCLUSIONS The method put forward in this paper can be effectively used to classify voice disorders.
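
The classification and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature values are synthetic stand-ins (12 values in place of the HHT features, 9 in place of the LPCC features), and the names `knn_predict`, `binary_metrics`, `make_sample`, and `demo`, as well as the choice of k = 3 and Euclidean distance, are assumptions made for illustration only. The sketch shows how a 21-dimensional feature vector could be labelled by majority vote among its nearest neighbours and how accuracy, precision, recall, and F1-score follow from the predictions.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training vectors (Euclidean)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary labelling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def make_sample(label, rng):
    """Synthetic 21-feature vector (assumption: NOT real HHT or LPCC output)."""
    shift = 1.5 if label == 1 else 0.0          # 1 = disordered, 0 = healthy
    hht_part = [rng.gauss(shift, 1.0) for _ in range(12)]   # stand-in for 12 HHT features
    lpcc_part = [rng.gauss(shift, 1.0) for _ in range(9)]   # stand-in for 9 LPCC features
    return hht_part + lpcc_part


def demo(seed=0, n_train=100, n_test=40, k=3):
    """Train on synthetic data and report metrics on a held-out synthetic set."""
    rng = random.Random(seed)
    train_y = [i % 2 for i in range(n_train)]
    train_X = [make_sample(y, rng) for y in train_y]
    test_y = [i % 2 for i in range(n_test)]
    test_X = [make_sample(y, rng) for y in test_y]
    pred = [knn_predict(train_X, train_y, x, k) for x in test_X]
    return binary_metrics(test_y, pred)


if __name__ == "__main__":
    print(demo())
```

Because the synthetic classes are well separated by construction, the metrics printed by `demo()` say nothing about real voice recordings; the values reported in the abstract come from the authors' own data and features.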
