Perturbation measurements in highly irregular voice signals: Performances/validity of analysis software tools

Abstract In this paper we present results on the validity of jitter measurement in strongly irregular voice signals (sustained vowels) moderately corrupted by noise. Four voice analysis tools are compared on synthetic signals with respect to fundamental period and jitter estimation. Synthesised vowels offer the advantage of full control over the amount of jitter introduced. Although all four tools implement the same formula for jitter estimation, their results diverge as jitter increases. The reason may lie in the different methods used to separate voiced from unvoiced frames and to estimate the fundamental period. Results show that all the tools give reliable results up to a jitter level of J = 15%, which encompasses the maximum value J = 12% obtained by expert raters through visual inspection. Hence, up to this limit, the tools presented here can give valid support to clinicians, also in terms of reproducibility of results and time saving. For jitter values larger than 15%, all programs tend to underestimate the true jitter value, but with large differences among them. Only two methods succeed in estimating jitter values up to and beyond 20% and could thus be better suited to perturbation measurement in strongly irregular voice signals.
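As an illustration of the kind of jitter formula referred to above, the following minimal sketch computes jitter as a percentage from a sequence of fundamental periods. The abstract does not state which formula the four tools share, so the definition used here (mean absolute difference between consecutive periods, divided by the mean period) is an assumption, and the function name `local_jitter_percent` is hypothetical.

```python
def local_jitter_percent(periods):
    """Return jitter in percent for a sequence of fundamental periods (seconds).

    Assumed definition (not confirmed by the abstract): mean absolute
    difference between consecutive periods, divided by the mean period.
    """
    if len(periods) < 2:
        raise ValueError("at least two periods are needed to measure jitter")

    # Absolute cycle-to-cycle differences |T[i+1] - T[i]|
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]

    mean_abs_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_abs_diff / mean_period


# Usage: periods alternating between 10 ms and 11 ms (illustrative values only)
example_periods = [0.010, 0.011, 0.010, 0.011]
print(f"J = {local_jitter_percent(example_periods):.2f}%")
```

Under this definition the result depends entirely on the extracted period sequence, which is consistent with the abstract's suggestion that differences among tools arise from voiced/unvoiced separation and fundamental period estimation rather than from the jitter formula itself.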
