Two Discourse Driven Language Models for Semantics

Natural language understanding often requires deep semantic knowledge. Expanding on previous proposals, we suggest that some important aspects of semantic knowledge can be modeled as a language model, provided the modeling is done at an appropriate level of abstraction. We develop two distinct models that capture semantic frame chains and discourse information while abstracting over the specific mentions of predicates and entities. For each model, we investigate four implementations: a "standard" N-gram language model and three discriminatively trained "neural" language models that generate embeddings for semantic frames. The quality of the resulting semantic language models (SemLMs) is evaluated both intrinsically, using perplexity and a narrative cloze test, and extrinsically: we show that our SemLMs help improve performance on semantic natural language processing tasks such as coreference resolution and discourse parsing.
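To make the setup concrete, the following is a minimal, self-contained sketch of the kind of model the abstract describes: an N-gram (here, bigram) language model over abstracted frame chains, evaluated intrinsically by perplexity and by a narrative cloze test. The frame tokens, the add-one smoothing, and the bigram order are all illustrative assumptions for this toy example, not the paper's actual frame inventory or model configuration.

```python
from collections import Counter
import math

# Toy corpus of abstracted frame chains (illustrative tokens only):
# each document is a sequence of predicate-frame symbols, with the
# specific entity and predicate mentions abstracted away.
chains = [
    ["arrest", "charge", "convict", "sentence"],
    ["arrest", "charge", "acquit"],
    ["charge", "convict", "sentence"],
]

# Collect bigram and unigram counts over the frame vocabulary.
vocab = sorted({f for c in chains for f in c})
bigrams = Counter()
unigrams = Counter()
for chain in chains:
    for prev, cur in zip(chain, chain[1:]):
        bigrams[(prev, cur)] += 1
        unigrams[prev] += 1

def prob(prev, cur):
    """Bigram probability with add-one (Laplace) smoothing."""
    return (bigrams[(prev, cur)] + 1) / (unigrams[prev] + len(vocab))

def perplexity(chain):
    """Intrinsic evaluation: per-transition perplexity of a frame chain."""
    logp = sum(math.log(prob(p, c)) for p, c in zip(chain, chain[1:]))
    return math.exp(-logp / (len(chain) - 1))

def cloze_rank(context_prev, gold, candidates):
    """Narrative cloze: rank candidate frames for a held-out slot by
    model probability given the preceding frame; return the rank of
    the gold (held-out) frame, with 1 being best."""
    scored = sorted(candidates, key=lambda f: -prob(context_prev, f))
    return scored.index(gold) + 1
```

A discriminatively trained neural variant would replace the count-based `prob` with scores computed from learned frame embeddings, but the two evaluation protocols (perplexity and cloze rank) stay the same.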
