Paper information - Objective assessment of Asthenia using energy and low-to-high spectral ratio


Vocal cord vibration is the source of voiced phonemes, and voice quality depends on the nature of this vibration. Vocal cords can be damaged by infection, neck or chest injury, tumours and more serious diseases such as laryngeal cancer, and this kind of physical harm can cause loss of voice quality. Voice quality is assessed by Speech and Language Therapists (SLTs), who use a well-known subjective assessment approach called GRBAS. GRBAS is an acronym for a five-dimensional scale of voice properties originally recommended by the Japanese Society of Logopedics and Phoniatrics and the European Research for clinical and research use. The properties are ‘Grade’, ‘Roughness’, ‘Breathiness’, ‘Asthenia’ and ‘Strain’. The objective assessment of the G, R, B and S properties has been well researched and can be carried out with commercial measurement equipment. The assessment of Asthenia, however, has been less extensively researched.

This paper concerns the objective assessment of ‘Asthenia’ using features extracted from 20 ms frames of the sustained vowel /a/. We develop two regression models that objectively estimate Asthenia and compare their predictions against the scores given by SLTs: ‘K nearest neighbor regression’ (KNNR) and ‘Multiple linear regression’ (MLR). These approaches differ from previous approaches in the literature in the subsets of features, the data sets and the prediction models used. The performance of the system is evaluated using the Normalised Root Mean Square Error (NRMSE) for each of 20 trials, taking as a reference the average score for each subject selected. The subsets of features that generate the lowest NRMSE are determined and used to evaluate the two regression models. The objective system was also compared with the scoring of each individual SLT: its NRMSE, averaged over 20 trials, was lower than that of two of the SLTs and only slightly higher than that of the third.
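The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not specify: non-overlapping frames, a Hann window, a 4 kHz boundary between the "low" and "high" bands (`cutoff_hz`), averaging frame features per recording, `k = 3` neighbours, and normalising the RMSE by the 0 to 3 range of a GRBAS score (`score_range`). All function names are hypothetical.

```python
import numpy as np

def frame_signal(x, fs, frame_ms=20.0):
    """Split x into non-overlapping frames of frame_ms milliseconds (assumed: no overlap)."""
    n = int(round(fs * frame_ms / 1000.0))
    n_frames = len(x) // n
    return np.asarray(x, dtype=float)[: n_frames * n].reshape(n_frames, n)

def frame_features(x, fs, cutoff_hz=4000.0):
    """Return [mean frame energy, mean low-to-high spectral ratio in dB].

    cutoff_hz is an assumed band boundary, not a value taken from the paper.
    """
    frames = frame_signal(x, fs)
    energy = np.mean(frames ** 2, axis=1)
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / fs)
    low = power[:, freqs < cutoff_hz].sum(axis=1)
    high = power[:, freqs >= cutoff_hz].sum(axis=1)
    eps = 1e-12  # avoids division by zero and log of zero
    lhr_db = 10.0 * np.log10((low + eps) / (high + eps))
    return np.array([energy.mean(), lhr_db.mean()])

def knn_regress(train_X, train_y, query, k=3):
    """K nearest neighbour regression: mean score of the k closest training points."""
    dist = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dist)[:k]
    return train_y[nearest].mean()

def mlr_fit(X, y):
    """Multiple linear regression by least squares; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mlr_predict(coef, X):
    """Apply fitted coefficients to a feature matrix X."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def nrmse(pred, ref, score_range=3.0):
    """RMSE divided by score_range (assumed normalisation: GRBAS scores span 0 to 3)."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return np.sqrt(np.mean((pred - ref) ** 2)) / score_range
```

In use, each recording would be reduced to a feature vector with `frame_features`, a model would be fitted on the training subjects, and `nrmse` would compare its predictions with the reference scores for the held-out subjects in each trial. Other normalisations of the RMSE (for example by the mean or the observed range of the reference scores) are equally plausible readings of "NRMSE".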
