Validity of jitter measures in non-quasi-periodic voices. Part I: Perceptual and computer performances in cycle pattern recognition

Abstract The limit of about 5% for reliable quantification of jitter in sustained vowels of dysphonic voices, a widely accepted guideline, deserves critical analysis. The present study examines the effect of experience and training on the perceptual (visual) ability to correctly identify periods in (highly) perturbed signals, and compares the performance of several voice analysis programs. Realistic synthesized vowels (/a:/) with exactly known jitter (2.7%–31.5%) serve as material. After selection and training, experienced raters show excellent agreement in correctly identifying periods, even at high values of imposed jitter. Perceptual rating outperforms all computer programs in accuracy. Most programs remain reliable up to 10% jitter; one of them measures correctly up to the highest level.
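For orientation, the sketch below shows how a jitter percentage can be derived once the cycle boundaries have been identified, whether by a rater or by a program. It assumes the common "local jitter" definition (mean absolute difference between consecutive periods, divided by the mean period); the abstract does not state which jitter formula the study uses, so the formula, the function name `local_jitter_percent`, and the sample values are illustrative assumptions only.

```python
def local_jitter_percent(periods):
    """Return local jitter in percent for a sequence of cycle durations.

    Assumed definition (not taken from the abstract): mean absolute
    difference between consecutive periods, relative to the mean period.
    """
    if len(periods) < 2:
        raise ValueError("at least two periods are required")
    # Absolute cycle-to-cycle differences between neighbouring periods.
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


# Hypothetical period durations in milliseconds, for illustration only.
example_periods_ms = [5.0, 5.2, 4.9, 5.1]
print(f"{local_jitter_percent(example_periods_ms):.2f} % jitter")
```

Because the measure depends entirely on the period sequence, any cycle that is misidentified in a strongly perturbed signal feeds directly into the result, which is why the accuracy of period identification matters.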
