Detection of vocal fold paralysis and oedema using time-domain features and Probabilistic Neural Network

This paper proposes a feature extraction method based on time-domain energy variation for the detection of vocal fold pathology. Two vocal fold disorders, paralysis and edema, are considered, and each is treated as a two-class pattern recognition problem. The normal and pathological speech samples are taken from the Massachusetts Eye and Ear Infirmary database. A Probabilistic Neural Network (PNN) is employed for classification. The experimental results show that the proposed features give a promising classification accuracy of 90% and can be used to detect vocal fold paralysis and edema clinically.
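
As a rough illustration of the pipeline described above, the following Python sketch computes short-time energy statistics from a signal and classifies feature vectors with a basic PNN (a Gaussian Parzen-window classifier). The abstract does not specify the exact features, frame settings, or smoothing parameter, so the function names, the three statistics chosen, `frame_len`, `hop`, and `sigma` are all illustrative assumptions rather than the authors' method.

```python
import numpy as np


def short_time_energy(signal, frame_len=256, hop=128):
    """Sum of squared samples in each frame (frame_len and hop are assumed values)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    return np.sum(frames ** 2, axis=1)


def energy_variation_features(signal, frame_len=256, hop=128):
    """Hypothetical feature vector: mean, standard deviation, and mean absolute
    frame-to-frame change of the log short-time energy."""
    log_e = np.log(short_time_energy(signal, frame_len, hop) + 1e-12)
    return np.array([log_e.mean(), log_e.std(), np.abs(np.diff(log_e)).mean()])


class PNN:
    """Minimal probabilistic neural network: one Gaussian kernel per training
    pattern, averaged per class; the class with the largest average wins."""

    def __init__(self, sigma=0.5):
        self.sigma = sigma  # kernel width; an assumed value, normally tuned

    def fit(self, X, y):
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def predict(self, X):
        X = np.atleast_2d(np.asarray(X, dtype=float))
        # Squared Euclidean distance between every test and training pattern.
        d2 = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=2)
        k = np.exp(-d2 / (2.0 * self.sigma ** 2))
        # Summation layer: average kernel response per class.
        scores = np.stack(
            [k[:, self.y == c].mean(axis=1) for c in self.classes], axis=1
        )
        return self.classes[scores.argmax(axis=1)]
```

In use, each recording would be mapped to a feature vector with `energy_variation_features`, and a separate `PNN` would be fitted for each two-class task (normal versus paralysis, normal versus edema). Features on different scales would usually be standardized first, since a single `sigma` is shared across all dimensions.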
