Semi-Supervised Semantic Tagging of Conversational Understanding using Markov Topic Regression

Finding concepts in natural language utterances is a challenging task, especially given the scarcity of labeled data for learning semantic ambiguity. Furthermore, data mismatch issues, which arise when the expected test (target) data does not exactly match the training data, aggravate this scarcity problem. To deal with these issues, we describe an efficient semisupervised learning (SSL) approach which has two components: (i) Markov Topic Regression is a new probabilistic model to cluster words into semantic tags (concepts). It can efficiently handle semantic ambiguity by extending standard topic models with two new features. First, it encodes word n-gram features from labeled source and unlabeled target data. Second, by going beyond a bag-of-words approach, it takes into account the inherent sequential nature of utterances to learn semantic classes based on context. (ii) Retrospective Learner is a new learning technique that adapts to the unlabeled target data. Our new SSL approach improves semantic tagging performance by 3% absolute over the baseline models, and also compares favorably on semi-supervised syntactic tagging.

[1]  Josef van Genabith,et al.  QuestionBank: Creating a Corpus of Parse-Annotated Questions , 2006, ACL.

[2]  Slav Petrov,et al.  Efficient Graph-Based Semi-Supervised Learning of Structured Tagging Models , 2010, EMNLP.

[3]  Xiao Li,et al.  Understanding the Semantic Structure of Noun Phrase Queries , 2010, ACL.

[4]  Sameer Singh,et al.  Minimally-Supervised Extraction of Entities from Text Advertisements , 2010, NAACL.

[5]  Patrick Pantel,et al.  Clustering by committee , 2003 .

[6]  Philip Resnik,et al.  SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations , 2012, ACL.

[7]  Alexander J. Smola,et al.  Word Features for Latent Dirichlet Allocation , 2010, NIPS.

[8]  Michal Rosen-Zvi,et al.  Hidden Topic Markov Models , 2007, AISTATS.

[9]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[10]  Thomas L. Griffiths,et al.  Integrating Topics and Syntax , 2004, NIPS.

[11]  Robert L. Mercer,et al.  Class-Based n-gram Models of Natural Language , 1992, CL.

[12]  Joseph Reisinger,et al.  Cross-Cutting Models of Lexical Semantics , 2011, EMNLP 2011.

[13]  Andrew McCallum,et al.  Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression , 2008, UAI.

[14]  Gokhan Tur,et al.  Spoken Language Understanding: Systems for Extracting Semantic Information from Speech , 2011 .

[15]  Xiaojin Zhu,et al.  --1 CONTENTS , 2006 .

[16]  Daumé,et al.  Frustratingly Easy Semi-Supervised Domain Adaptation , 2010 .

[17]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[18]  Caroline Sporleder,et al.  Topic Models for Word Sense Disambiguation and Token-Based Idiom Detection , 2010, ACL.

[19]  Jason Baldridge,et al.  Crouching Dirichlet, Hidden Markov Model: Unsupervised POS Tagging with Context Local Tag Generation , 2010, EMNLP.

[20]  Xiao Li,et al.  Lexicon modeling for query understanding , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[21]  Xiao Li,et al.  Extracting structured information from user queries with semi-supervised conditional random fields , 2009, SIGIR.

[22]  Xiaojin Zhu,et al.  A Topic Model for Word Sense Disambiguation , 2007, EMNLP.

[23]  Xian Wu,et al.  Domain Adaptation with Latent Semantic Association for Named Entity Recognition , 2009, NAACL.

[24]  Nathan D. Ratliff,et al.  History Dependent Domain Adaptation , 2011 .

[25]  Alexander Zien,et al.  Semi-Supervised Learning , 2006 .

[26]  Xiao Li,et al.  Semi-supervised learning of semantic classes for query understanding: from the web and for the web , 2009, CIKM.