Adapting dependency parsing to spontaneous speech for open domain spoken language understanding

Parsing human-human conversations consists in automatically enriching text transcriptions with semantic structure information. In this paper we use a FrameNet-based approach to semantics that, without requiring a full semantic parse of a message, goes further than a simple flat translation of a message into basic concepts. FrameNet-based semantic parsing may follow a syntactic parsing step; however, spoken conversations in customer-service telephone call centres have very specific characteristics: non-canonical language, noisy messages (disfluencies, repetitions, truncated words, automatic speech transcription errors) and the presence of superfluous information. For syntactic parsing, the traditional approach based on context-free grammars is not suited to processing such non-canonical text. Approaches based on dependency structures and discriminative machine learning techniques are better adapted to spontaneous speech for two main reasons: (a) they need less training data, and (b) annotating conversation transcripts with syntactic dependencies is simpler than annotating them with syntactic constituents. A further advantage is that partial annotation is possible. This paper presents the adaptation of a syntactic dependency parser to very spontaneous speech recorded in a call-centre environment. The parser is used to produce FrameNet frame candidates for characterizing conversations between an operator and a caller.
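As a rough illustration of the pipeline described above, the sketch below shows how FrameNet frame candidates can be derived from a dependency-parsed call-centre utterance. The Token structure, the toy lexical-unit table, the "disfluency" relation label and the example sentence are all illustrative assumptions; they do not reproduce the authors' actual parser, annotation scheme or FrameNet resources.

```python
# Minimal sketch: from a dependency-parsed spontaneous utterance to
# FrameNet frame candidates. All resources here are toy/hypothetical.

from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Token:
    idx: int      # position in the utterance
    form: str     # surface form (may be a repetition or ASR error)
    lemma: str
    head: int     # index of the governor (-1 for the root)
    deprel: str   # dependency relation label

# Toy lexical-unit table: lemma -> candidate FrameNet frame(s)
LEXICAL_UNITS: Dict[str, List[str]] = {
    "lose": ["Losing"],
    "ticket": ["Documents"],
    "want": ["Desiring"],
}

def frame_candidates(tokens: List[Token]) -> List[Tuple[str, str, List[str]]]:
    """Return (trigger lemma, frame, dependents) triples.

    Dependents of each trigger are kept as rough frame-element candidates;
    tokens attached with a 'disfluency' relation are ignored.
    """
    candidates = []
    for tok in tokens:
        if tok.deprel == "disfluency":
            continue
        for frame in LEXICAL_UNITS.get(tok.lemma, []):
            deps = [t.form for t in tokens
                    if t.head == tok.idx and t.deprel != "disfluency"]
            candidates.append((tok.lemma, frame, deps))
    return candidates

# "uh I I lost my ticket" -- the filler and repetition are attached
# to the verb with a 'disfluency' relation and filtered out.
utterance = [
    Token(0, "uh", "uh", 3, "disfluency"),
    Token(1, "I", "I", 3, "disfluency"),
    Token(2, "I", "I", 3, "nsubj"),
    Token(3, "lost", "lose", -1, "root"),
    Token(4, "my", "my", 5, "poss"),
    Token(5, "ticket", "ticket", 3, "obj"),
]

print(frame_candidates(utterance))
# [('lose', 'Losing', ['I', 'ticket']), ('ticket', 'Documents', ['my'])]
```

The design point this sketch is meant to convey is that a dependency structure tolerates partial analyses: disfluent or noisy material can be attached with a dedicated relation and simply skipped when collecting frame and frame-element candidates, which is harder to do with a full constituent parse.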
