TimeMachine: Timeline Generation for Knowledge-Base Entities

We present TIMEMACHINE, a method for generating a timeline of events and relations for entities in a knowledge base. For example, for an actor, such a timeline should show the most important professional and personal milestones and relationships, such as works, awards, collaborations, and family relationships. We develop three orthogonal quality criteria that an ideal timeline should satisfy: (1) it shows events that are relevant to the entity; (2) it shows events that are temporally diverse, so that they are spread along the time axis, which avoids visual crowding and allows easy user interaction such as zooming in and out; and (3) it shows events that are content diverse, so that they cover many different event types (e.g., for an actor, movies, marriages, and awards, not just movies). We present an algorithm with provable performance guarantees that generates such timelines for a given time period and screen size, based on submodular optimization and web co-occurrence statistics. A series of user studies on Mechanical Turk shows that all three quality criteria are crucial for producing quality timelines and that our algorithm significantly outperforms various baseline and state-of-the-art methods.
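To make the three criteria concrete, the following is a minimal sketch of greedy selection under a monotone submodular objective. It is not the TIMEMACHINE algorithm itself: the abstract does not give the objective or the constraints, so the `Event` fields, the equal weights, the bucketing of the time axis, and the use of a simple budget `k` as a stand-in for screen size are all illustrative assumptions. The relevance scores are made up; the abstract only says that relevance draws on web co-occurrence statistics. For an objective of this form (a sum of relevances plus two coverage terms) with a budget on the number of items, the standard greedy procedure is known to reach at least a (1 - 1/e) fraction of the optimum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical record; field names are assumptions, not the paper's schema.
    label: str
    year: int
    kind: str         # event type, e.g. "movie", "award", "marriage"
    relevance: float  # made-up score standing in for a co-occurrence statistic


def objective(selected, start, end, buckets, w_time=1.0, w_kind=1.0):
    """Relevance sum + covered time buckets + covered event types.

    Each term is monotone submodular (one modular, two coverage terms),
    so the weighted sum is monotone submodular as well.
    """
    span = end - start + 1
    covered_buckets = {(e.year - start) * buckets // span for e in selected}
    covered_kinds = {e.kind for e in selected}
    relevance = sum(e.relevance for e in selected)
    return relevance + w_time * len(covered_buckets) + w_kind * len(covered_kinds)


def greedy_timeline(events, k, start, end, buckets=4):
    """Pick at most k events in [start, end] by largest marginal gain."""
    candidates = [e for e in events if start <= e.year <= end]
    chosen = []
    while candidates and len(chosen) < k:
        base = objective(chosen, start, end, buckets)
        best = max(
            candidates,
            key=lambda e: objective(chosen + [e], start, end, buckets) - base,
        )
        chosen.append(best)
        candidates.remove(best)
    return sorted(chosen, key=lambda e: e.year)


# Made-up events for a hypothetical actor.
events = [
    Event("Film A", 1991, "movie", 0.9),
    Event("Film B", 1992, "movie", 0.8),
    Event("Film C", 1993, "movie", 0.7),
    Event("Award", 2001, "award", 0.5),
    Event("Marriage", 2006, "marriage", 0.4),
]
timeline = greedy_timeline(events, k=3, start=1990, end=2009)
print([e.label for e in timeline])
```

In this toy input, ranking by relevance alone would select the three films, which fall in the same few years and share one type. The two coverage terms are what steer the greedy choice toward events from other periods and of other types.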
