Learning from a tutor: Embodied speech acquisition and imitation learning

This work presents a new developmentally inspired, data-driven framework to bootstrap speech perception and imitation abilities in interaction with a tutor. The proposed system architecture extends our work presented in [1], which implements a cascade of interconnected layers to acquire the structure of speech in terms of phones, syllables and words. Here, we show how to couple such a perceptual model with a speech imitation system based on an acoustic synthesizer constrained to produce speech sounds with a child's voice.

Miguel Vaz | Frank Joublin | Christian Goerick | Holger Brandl

[1] Naoto Iwahashi, et al. Robots That Learn Language: Developmental Approach to Human-Machine Conversations, 2006, EELC.

[2] J. Perkell, et al. A Neural Model of Speech Production and Its Application to Studies of the Role of Auditory Feedback in Speech, 2003.

[3] Inna Mikhailova, et al. Expectation-driven autonomous learning and interaction system, 2008, Humanoids 2008 - 8th IEEE-RAS International Conference on Humanoid Robots.

[4] R. G. Leonard, et al. A database for speaker-independent digit recognition, 1984, ICASSP.

[5] Peter W. Jusczyk, et al. How infants begin to extract words from speech, 1999, Trends in Cognitive Sciences.

[6] Bernd J. Kröger, et al. Towards a neurocomputational model of speech production and perception, 2009, Speech Commun.

[7] A. Meltzoff, et al. Infant vocalizations in response to speech: vocal imitation and developmental change, 1996, The Journal of the Acoustical Society of America.

[8] Ian S. Howard, et al. A Computational Model of Infant Speech Development, 2007.

[9] 1A2-L07 Finding the Correspondence of Caregiver's Vowel Categories Based on Unconscious Anchoring in Maternal Imitation, 2007.

[10] Miguel Vaz, et al. Speech imitation with a child’s voice: addressing the correspondence problem, 2009.

[11] G. Westermann, et al. A new model of sensorimotor coupling in the development of speech, 2004, Brain and Language.

[12] B. Wrede, et al. A self-referential childlike model to acquire phones, syllables and words from acoustic speech, 2008, 2008 7th IEEE International Conference on Development and Learning.

[13] Brian Scassellati, et al. Audio Speech Segmentation Without Language-Specific Knowledge, 2006.

[14] E. Newport, et al. Computation of Conditional Probability Statistics by 8-Month-Old Infants, 1998.