Narrative Schema Stability in News Text

We investigate the stability of narrative schemas (Chambers and Jurafsky, 2009), representations of recurring narratives that are automatically induced from a news corpus. If such techniques produce meaningful results, we should expect that small changes to the corpus will result in only small changes to the induced schemas. We describe experiments involving successive ablation of a corpus, with cross-validation at each stage of ablation, on schemas generated by three different techniques over a general news corpus and topically specific subcorpora. We also develop a method for evaluating the similarity between sets of narrative schemas, and thus the stability of the schema induction algorithms. This stability analysis affirms the heterogeneous/homogeneous document category hypothesis first presented in Simonson and Davis (2016), whose technique was problematically limited. Additionally, increased ablation leads to increased stability: the smaller the remaining corpus, the more stable schema generation appears to be. We surmise that as a corpus grows larger, novel and more varied narratives continue to appear and stability declines, though at some point this decline levels off as new additions to the corpus consist essentially of “more of the same.”
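
As a rough illustration of the kind of procedure the abstract describes, the Python sketch below subsamples a corpus at several ablation levels, induces a schema set from each subsample, and scores the pairwise similarity of the resulting schema sets. It is a minimal sketch only: induce_schemas and schema_set_similarity are hypothetical placeholders, not the induction algorithms or the schema-set similarity measure used in the paper.

    # Sketch of an ablation-and-comparison loop for schema stability (assumptions noted below).
    # induce_schemas() and schema_set_similarity() are hypothetical stand-ins,
    # NOT the paper's induction algorithms or its similarity measure.
    import random
    from typing import FrozenSet, List, Set

    def induce_schemas(docs: List[str]) -> Set[FrozenSet[str]]:
        # Placeholder induction: treat each document's past-tense-looking tokens
        # as one "schema" of co-occurring events. A real system would run
        # Chambers-and-Jurafsky-style schema induction here.
        schemas = set()
        for doc in docs:
            events = frozenset(tok for tok in doc.split() if tok.endswith("ed"))
            if events:
                schemas.add(events)
        return schemas

    def schema_set_similarity(a: Set[FrozenSet[str]], b: Set[FrozenSet[str]]) -> float:
        # Placeholder set-to-set similarity: Jaccard overlap of exact schemas.
        if not a and not b:
            return 1.0
        return len(a & b) / len(a | b)

    def ablation_stability(corpus: List[str],
                           ablation_levels=(1.0, 0.8, 0.6, 0.4),
                           folds: int = 5,
                           seed: int = 0):
        # For each ablation level, induce schemas from several random subsamples
        # and report the mean pairwise similarity of the resulting schema sets.
        rng = random.Random(seed)
        stability = {}
        for frac in ablation_levels:
            k = max(1, int(frac * len(corpus)))
            schema_sets = [induce_schemas(rng.sample(corpus, k)) for _ in range(folds)]
            sims = [schema_set_similarity(schema_sets[i], schema_sets[j])
                    for i in range(folds) for j in range(i + 1, folds)]
            stability[frac] = sum(sims) / len(sims)
        return stability

    if __name__ == "__main__":
        toy_corpus = ["police arrested and charged the suspect",
                      "the court convicted and sentenced the man",
                      "the team signed and traded a player",
                      "the storm damaged and flooded the town"] * 10
        print(ablation_stability(toy_corpus))

With these placeholders the output is only illustrative; the point of the sketch is the experimental shape described in the abstract, in which stability at each ablation level is read off the agreement among schema sets induced from comparable subcorpora.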

[1] Nathanael Chambers et al. A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories, 2016, arXiv.

[2] Nathanael Chambers et al. Unsupervised Learning of Narrative Event Chains, 2008, ACL.

[3] D. Simonson. Investigations of the Properties of Narrative Schemas, 2018.

[4] Oren Etzioni et al. Generating Coherent Event Schemas at Scale, 2013, EMNLP.

[5] Raymond J. Mooney et al. Statistical Script Learning with Multi-Argument Events, 2014, EACL.

[6] Yuta Kikuchi et al. The New York Times Annotated Corpus as a Large-Scale Summarization Resource (in Japanese), 2015.

[7] Roman Yangarber et al. Counter-Training in Discovery of Semantic Patterns, 2003, ACL.

[8] Nathanael Chambers et al. Unsupervised Learning of Narrative Schemas and their Participants, 2009, ACL.

[9] Marie-Francine Moens et al. Skip N-grams and Ranking Functions for Predicting Script Events, 2012, EACL.

[10] Raymond J. Mooney et al. Learning Statistical Scripts with LSTM Recurrent Neural Networks, 2016, AAAI.

[11] Mihai Surdeanu et al. The Stanford CoreNLP Natural Language Processing Toolkit, 2014, ACL.

[12] Anthony R. Davis et al. Interactions between Narrative Schemas and Document Categories, 2015.

[13] Katrin Erk et al. Implicit Argument Prediction with Event Knowledge, 2018, NAACL.

[14] Matt J. Kusner et al. From Word Embeddings To Document Distances, 2015, ICML.

[15] Heeyoung Lee et al. Deterministic Coreference Resolution Based on Entity-Centric, Precision-Ranked Rules, 2013, Computational Linguistics.

[16] Rada Mihalcea et al. Measuring the Semantic Similarity of Texts, 2005, EMSEE@ACL.

[17] Gerald DeJong et al. Learning Schemata for Natural Language Processing, 1985, IJCAI.

[18] Christopher D. Manning et al. Generating Typed Dependency Parses from Phrase Structure Parses, 2006, LREC.

[19] Francis Ferraro et al. Script Induction as Language Modeling, 2015, EMNLP.

[20] Roger C. Schank et al. Scripts, Plans, Goals and Understanding: An Inquiry into Human Knowledge Structures, 1978.

[21] Anthony R. Davis et al. NASTEA: Investigating Narrative Schemas through Annotated Entities, 2016.