A framework for the automatic inference of stochastic turn-taking styles

Conversant-independent stochastic turn-taking (STT) models generally benefit from additional training data. However, conversants are patently not identical in turn-taking style: recent research has shown that conversant-specific models can be used to refractively detect some conversants in unseen conversations. The current work explores an unsupervised framework for studying turn-taking style variability. First, within a verification framework using an information-theoretic model distance, sides cluster by conversant more often than not. Second, multi-dimensional scaling onto low-dimensional subspaces appears capable of preserving distance. These observations suggest that, for many speakers, turn-taking style as characterized by time-independent STT models is a stable attribute, which may be correlated with other stable speaker attributes such as personality. The exploratory techniques presented stand to benefit speaker diarization technology, dialogue agent design, and automated psychological diagnosis.
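The two steps named in the abstract, a distance between models followed by a low-dimensional embedding, can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the Jensen-Shannon divergence stands in for the unnamed information-theoretic distance, classical (Torgerson) multi-dimensional scaling stands in for whichever scaling procedure was actually used, and each conversation side is reduced to a toy two-outcome distribution (speak vs. stay silent) in place of a full STT model.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        # Terms with a == 0 contribute nothing to the KL sum.
        mask = a > 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def classical_mds(dist, dims=2):
    """Classical (Torgerson) MDS: embed a distance matrix into `dims` dimensions."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:dims]
    # Clip small negative eigenvalues that arise from non-Euclidean distances.
    top_vals = np.clip(vals[order], 0.0, None)
    return vecs[:, order] * np.sqrt(top_vals)

# Hypothetical toy data: four sides, the first two from one conversant,
# the last two from another. Each row is P(speak), P(silent).
sides = np.array([
    [0.80, 0.20],
    [0.75, 0.25],
    [0.30, 0.70],
    [0.35, 0.65],
])

n_sides = len(sides)
dist = np.zeros((n_sides, n_sides))
for i in range(n_sides):
    for j in range(n_sides):
        # The square root of the Jensen-Shannon divergence is a metric.
        dist[i, j] = np.sqrt(js_divergence(sides[i], sides[j]))

coords = classical_mds(dist, dims=2)
print(np.round(dist, 3))
print(np.round(coords, 3))
```

In this toy setting, sides from the same hypothetical conversant lie closer together in `dist` than sides from different conversants, which is the kind of clustering the abstract describes. How well a two-dimensional embedding preserves the distances would need to be checked separately, for example by comparing pairwise distances in `coords` against `dist`.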
