Auto-imputing radial basis functions for neural-network turn-taking models

A stochastic turn-taking (STT) model is a per-frame predictor of incipient speech activity. Its ability to make predictions at any instant in time makes it particularly well suited to the analysis and synthesis of interactive conversation. Currently, however, STT models are limited by their inability to accept features which may frequently be undefined. Rather than attempting to impute such features, this work proposes and evaluates a mechanism which implicitly conditions Gaussian-distributed features on Bernoulli-distributed indicator features, making prior imputation unnecessary. Experiments indicate that the proposed mechanism achieves predictive parity with standard model structures, while at the same time offering more direct interpretability and the desired insensitivity to missing feature values.
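
The idea of conditioning a continuous feature on a binary "is defined" indicator can be illustrated with a minimal sketch. This is not the model evaluated in this work. The function name `gated_rbf` and the parameters `center`, `width`, and `undefined_response` are hypothetical, chosen only to show how a single Gaussian radial basis unit might respond to a feature that is sometimes undefined, without the undefined value being imputed beforehand.

```python
import math

def gated_rbf(x, is_defined, center, width, undefined_response):
    """Response of one Gaussian radial basis unit to a possibly undefined feature.

    Illustrative assumption: when the indicator says the feature is undefined,
    the unit emits a fixed (in practice, learnable) response instead of
    evaluating the Gaussian on a placeholder value.
    """
    if not is_defined:
        # Branch rather than multiply by a zero indicator: 0 * NaN is still NaN.
        return undefined_response
    z = (x - center) / width
    return math.exp(-0.5 * z * z)

# One frame where the feature is defined, one where it is undefined (NaN).
frames = [(1.5, True), (float("nan"), False)]
responses = [
    gated_rbf(x, d, center=1.5, width=0.5, undefined_response=0.2)
    for x, d in frames
]
print(responses)
```

In this sketch the undefined frame never reaches the Gaussian, so whatever placeholder is stored for the missing value has no effect on the unit's output.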
