Simulating online compensation for pitch-shifted auditory feedback with target approximation model

This study simulates the well-known phenomenon of online compensation for pitch-shifted auditory feedback. We used the Target Approximation (TA) model as the underlying kinematic mechanism of pitch contour generation and simulated feedback compensation by perturbing the model's height parameter in response to the shift. The results show that both within-syllable and cross-syllable pitch compensation in disyllabic utterances can be replicated. Our data analysis also revealed an over-rectification phenomenon. Adjusting the height parameter back toward and beyond its original value after the compensation replicated this over-rectification as well, further improving the overall simulation results.
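The mechanism described above can be illustrated with a minimal sketch. The abstract gives no equations, so the code assumes the commonly described quantitative form of target approximation: a third-order critically damped response toward a linear pitch target, with F0, velocity, and acceleration carried across boundaries. The function names, the response latency, the size of the height changes, and all numeric values are hypothetical and chosen only for illustration; they are not the study's fitted parameters. Slopes are set to zero so that a syllable can be split at the moment the height changes without re-deriving the target offset.

```python
import math

def ta_segment(state, height, slope, rate, duration, dt=0.005):
    """One stretch of target approximation toward the target slope*t + height.

    state is (f0, velocity, acceleration) at the start of the stretch.
    Returns the sampled contour and the state at the end of the stretch.
    Assumed form: f(t) = slope*t + height + (c1 + c2*t + c3*t^2) * exp(-rate*t).
    """
    f0, v0, a0 = state
    # Coefficients fixed by continuity with the incoming state.
    c1 = f0 - height
    c2 = v0 - slope + rate * c1
    c3 = (a0 + 2.0 * rate * c2 - rate ** 2 * c1) / 2.0

    def evaluate(t):
        decay = math.exp(-rate * t)
        p = c1 + c2 * t + c3 * t * t
        dp = c2 + 2.0 * c3 * t
        ddp = 2.0 * c3
        f = slope * t + height + p * decay
        v = slope + (dp - rate * p) * decay
        a = (ddp - 2.0 * rate * dp + rate ** 2 * p) * decay
        return f, v, a

    n = int(round(duration / dt))
    samples = [evaluate(i * dt)[0] for i in range(n)]
    return samples, evaluate(duration)

def simulate(plan, start_f0, dt=0.005):
    """Chain stretches; each plan entry is (height, slope, rate, duration)."""
    state = (start_f0, 0.0, 0.0)
    contour = []
    for height, slope, rate, duration in plan:
        samples, state = ta_segment(state, height, slope, rate, duration, dt)
        contour.extend(samples)
    return contour

RATE = 40.0  # hypothetical approximation rate, in 1/s

# Unperturbed disyllabic utterance: static targets at 0 and 2 semitones.
baseline = simulate([(0.0, 0.0, RATE, 0.25),
                     (2.0, 0.0, RATE, 0.25)], start_f0=1.0)

# Hypothetical response to an upward feedback shift:
#   syllable 1: height lowered by 0.3 st after an assumed 100 ms latency
#   syllable 2: lowered height carried over, then raised 0.1 st above the
#               original height (over-rectification)
perturbed = simulate([(0.0, 0.0, RATE, 0.10),
                      (-0.3, 0.0, RATE, 0.15),
                      (1.7, 0.0, RATE, 0.10),
                      (2.1, 0.0, RATE, 0.15)], start_f0=1.0)

# Difference contour: negative during compensation, positive after over-rectification.
difference = [p - b for p, b in zip(perturbed, baseline)]
```

Because the height change enters as a new target rather than as a direct edit of the contour, the simulated response is smooth and lags the change, which is the qualitative behavior the abstract attributes to perturbing the height parameter.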
