Unsupervised Timeline Generation for Wikipedia History Articles

This paper presents a generic approach to content selection for creating timelines from individual history articles for which no external information about the same topic is available. This scenario contrasts with existing work on timeline generation, which requires the presence of a large corpus of news articles. To identify salient events in a given history article, we exploit lexical cues about the article’s subject area, as well as time expressions that are syntactically attached to an event word. We also test different methods of ensuring that the timeline covers the entire historical time span described. Our best-performing method outperforms a new unsupervised baseline and an improved version of an existing supervised approach. We see our work as a step towards more semantically motivated approaches to single-document summarisation.
