Learning to Identify Historical Figures for Timeline Creation from Wikipedia Articles

This paper addresses a central sub-task of timeline creation from historical Wikipedia articles: learning from text which of the person names in an article should appear in a timeline on the same topic. We first process hundreds of timelines written by human experts, together with related Wikipedia articles, to construct a corpus for evaluating systems that create history timelines from text documents. We then train a classifier on a set of features to predict the most important person names, which yields a clear improvement over a competitive baseline.
