Inducing Script Structure from Crowdsourced Event Descriptions via Semi-Supervised Clustering

We present a semi-supervised clustering approach for inducing script structure from crowdsourced descriptions of event sequences. The approach groups event descriptions into paraphrase sets, each representing an event type, and induces the temporal order of these sets. Our model exploits both semantic and positional similarity and allows for flexible event order, thus overcoming the rigidity of previous approaches. We incorporate crowdsourced alignments as prior knowledge and show that exploiting a small number of alignments substantially improves cluster quality over state-of-the-art models and provides an appropriate basis for inducing temporal order. We also present a coverage study that demonstrates the scalability of our approach.
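The following is a minimal illustrative sketch of the general idea, not the model described in the paper. It assumes that each event description comes with a normalized position in its sequence (0.0 for the first event, 1.0 for the last), uses token overlap as a crude stand-in for semantic similarity, treats crowdsourced alignments as must-link pairs, and swaps in simple threshold-based single-link merging for the clustering step. All function names, the weight, the threshold, and the example data are hypothetical.

```python
from itertools import combinations


def token_overlap(a, b):
    """Jaccard overlap of word sets; a crude stand-in for semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def positional_similarity(p, q):
    """Similarity of two normalized sequence positions in [0, 1]."""
    return 1.0 - abs(p - q)


def combined_similarity(e1, e2, weight=0.5):
    """Weighted mix of the semantic stand-in and positional similarity."""
    return (weight * token_overlap(e1[0], e2[0])
            + (1 - weight) * positional_similarity(e1[1], e2[1]))


def cluster(events, must_link=(), threshold=0.6, weight=0.5):
    """Group event indices: seed with must-link pairs, then merge similar pairs."""
    parent = list(range(len(events)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Prior knowledge (assumed: aligned pairs belong to the same event type).
    for i, j in must_link:
        parent[find(i)] = find(j)
    # Single-link merging of every pair at or above the threshold.
    for i, j in combinations(range(len(events)), 2):
        if combined_similarity(events[i], events[j], weight) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(events)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def order_by_position(events, clusters):
    """Order clusters by the mean position of their members."""
    return sorted(clusters, key=lambda c: sum(events[i][1] for i in c) / len(c))


# Hypothetical descriptions for a "baking a cake" scenario: (text, position).
events = [
    ("preheat the oven", 0.0),
    ("turn on the oven", 0.1),
    ("mix the ingredients", 0.4),
    ("stir the ingredients together", 0.5),
    ("eat the cake", 1.0),
    ("heat oven to 180 degrees", 0.05),
]

unsupervised = cluster(events)
semi_supervised = cluster(events, must_link=[(0, 5)])
ordered = order_by_position(events, semi_supervised)
```

In this toy data, the description "heat oven to 180 degrees" shares too few words with the other oven-related descriptions to be merged by similarity alone, so it ends up in its own cluster unless the single must-link pair is supplied. This is only meant to illustrate why a small number of alignments can help; the choice of similarity measures and clustering algorithm here is an assumption.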
