Speech dominoes and phonetic convergence

Interlocutors are known to adapt to each other during conversation. Recent studies have examined the adaptation of phonological representations and of the kinematics of phonetic variables such as loudness, speech rate, or fundamental frequency. Results are often contradictory, and whether phonetic convergence actually occurs during conversation remains an open issue. This paper describes an original experimental paradigm, based on verbal dominoes (a game played in primary schools), that enables us to collect several hundred syllables uttered by both speakers in different conditions: alone, in ambient speech, or in full interaction. Speech recognition techniques are then applied to characterize phonetic convergence globally, if any occurs. We hypothesize that convergence of phonetic representations, such as vocalic dispersions, is not immediate, especially for common words of the target language.
