Performance of wavelet analysis and neural networks for pathological voices identification

In clinical practice, diverse techniques exist to assess the state of a patient's voice. Inspection-based techniques are inconvenient for several reasons: they are costly, time-consuming and, above all, invasive. This study focuses on a robust, rapid and accurate system for the automatic identification of pathological voices. The system employs a non-invasive, inexpensive and fully automated method based on a hybrid approach: wavelet transform analysis combined with a neural network classifier. First, we present the results obtained in our previous study using classic feature parameters; these results allow visual identification of pathological voices. Second, we propose quantified parameters derived from the wavelet analysis to characterise the speech sample. In addition, a system of multilayer neural networks (MNNs) has been developed to carry out the automatic detection of pathological voices. The method was evaluated on a voice database composed of recorded voice samples (continuous speech) from normophonic and dysphonic speakers. The dysphonic speakers were patients of the 'RABTA' National Hospital in Tunis, Tunisia, and of a University Hospital in Brussels, Belgium. Experimental results indicate a success rate ranging between 75% and 98.61% for the discrimination of normal and pathological voices using the proposed parameters and the neural network classifier. We also compare the average classification rates obtained with the MNN, a Gaussian mixture model and support vector machines.
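To make the idea of quantified wavelet parameters concrete, the following is a minimal sketch of one plausible feature set: the relative energy of each wavelet sub-band plus the entropy of that energy distribution. The abstract does not specify the wavelet, the number of decomposition levels, or the exact parameters used, so the Haar wavelet, the default of four levels, and the function names `haar_dwt` and `wavelet_features` are all assumptions made for illustration only.

```python
import numpy as np


def haar_dwt(signal):
    """One level of the Haar DWT; returns (approximation, detail)."""
    x = np.asarray(signal, dtype=float)
    if len(x) % 2:
        # Pad odd-length input by repeating the last sample.
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)
    detail = (even - odd) / np.sqrt(2.0)
    return approx, detail


def wavelet_features(signal, levels=4):
    """Relative sub-band energies followed by their entropy.

    Assumed feature set (not taken from the paper): `levels` detail bands,
    one final approximation band, and the Shannon entropy of the
    normalised energy distribution.
    """
    approx = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        energies.append(np.sum(detail ** 2))
    energies.append(np.sum(approx ** 2))
    energies = np.array(energies)

    total = energies.sum()
    if total > 0:
        p = energies / total
    else:
        # Silent input: fall back to a uniform distribution.
        p = np.full(len(energies), 1.0 / len(energies))

    nonzero = p[p > 0]
    entropy = -np.sum(nonzero * np.log(nonzero))
    return np.append(p, entropy)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(1024)  # stand-in for one speech frame
    print(wavelet_features(frame))
```

A feature vector of this kind, computed per frame or per recording, could then be passed to any of the classifiers named above (MNN, Gaussian mixture model or support vector machine); the classifier stage is left out here because its architecture is not described in the abstract.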
