Temporal Information Retrieval Revisited: A Focused Study on the Web

Temporal information retrieval has been an active area of research for a number of years. Most existing work focuses on the utility of temporal information in specific types of corpora (such as news archives), specific types of retrieval approaches, and specific applications that may benefit from temporal information (such as timeline summarization). Studies that investigate the impact of temporal information analysis on the Web and on Web documents are underrepresented. In this paper we (i) describe the research gaps we identified around Web-based temporal information analysis, and (ii) present an overview of our first results and observations from studying sub-document timestamping on the Web.
