Character-based kernels for novelistic plot structure

Better representations of plot structure could greatly improve computational methods for summarizing and generating stories. Current representations lack abstraction, focusing too closely on events. We present a kernel for comparing novelistic plots at a higher level, in terms of the cast of characters they depict and the social relationships between them. Our kernel compares the characters of different novels to one another by measuring their frequency of occurrence over time and the descriptive and emotional language associated with them. Given a corpus of 19th-century novels as training data, our method can accurately distinguish held-out novels in their original form from artificially disordered or reversed surrogates, demonstrating its ability to robustly represent important aspects of plot structure.
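As a rough illustration of the idea, the sketch below compares two novels by summing a character-level similarity over every pair of characters, one drawn from each novel. It is a minimal sketch under stated assumptions, not the kernel defined in the paper: the character representation (a `freq` trajectory of per-segment mention frequencies plus a `words` bag of associated descriptive or emotional words), the RBF and cosine components, the `bandwidth` parameter, and all function names are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def trajectory_kernel(a, b, bandwidth=1.0):
    """RBF similarity between two equal-length frequency trajectories (assumed form)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2 * bandwidth ** 2))


def word_kernel(u, v):
    """Cosine similarity between two bags of associated words (assumed form)."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def character_kernel(c1, c2):
    """Similarity of two characters: trajectory similarity times language similarity."""
    return trajectory_kernel(c1["freq"], c2["freq"]) * word_kernel(c1["words"], c2["words"])


def novel_kernel(n1, n2):
    """Sum character similarities over all cross-novel character pairs."""
    return sum(character_kernel(a, b) for a in n1 for b in n2)


# Hypothetical toy data: each novel is a list of characters.
heroine = {"freq": [0.1, 0.5, 0.9], "words": Counter({"happy": 2, "proud": 1})}
rival = {"freq": [0.8, 0.4, 0.1], "words": Counter({"angry": 3, "proud": 1})}
novel = [heroine, rival]

# A reversed surrogate: the same characters with their trajectories flipped in time.
reversed_novel = [{"freq": c["freq"][::-1], "words": c["words"]} for c in novel]

print(novel_kernel(novel, novel), novel_kernel(novel, reversed_novel))
```

Because the trajectory component depends on when a character appears, reversing a novel changes its similarity to the original even though the cast and vocabulary are unchanged, which is the kind of sensitivity the ordering experiments rely on.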
