Joint, Incremental Disfluency Detection and Utterance Segmentation from Speech

We present the joint task of incremental disfluency detection and utterance segmentation, together with a simple deep learning system that performs it on transcripts and ASR results. We show how the constraints of the two tasks interact. Our joint-task system outperforms the equivalent individual-task systems, provides competitive results, and is suitable for future use in conversation agents in the psychiatric domain.
