Words Temporality for Improving Query Expansion

Much recent work aims to improve the effectiveness of Information Retrieval using temporal information extracted from texts. Some approaches use all dates, while others use only document creation or modification timestamps. However, no previous work explicitly focuses on using dates within the document content to establish temporal relationships between the words in the document. This work estimates these relationships through a temporal segmentation of the texts and exploits them to expand queries. The results are very promising (a 13% improvement in Precision@15), especially for temporally aware queries. To the best of our knowledge, this is the first work to use temporal text segmentation to improve retrieval results.
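The general idea of segmenting text by the dates it mentions and expanding a query with words from matching segments can be illustrated with a minimal sketch. It is not the method of this work, whose segmentation and term-weighting details are not given here. It assumes a deliberately naive rule: each sentence belongs to the most recent four-digit year mentioned so far, and words sharing a segment with a query term are expansion candidates ranked by raw frequency. All names (`temporal_segments`, `expand_query`, `k`) are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumption: a "date" is just a four-digit year between 1800 and 2099.
YEAR = re.compile(r"\b(1[89]\d{2}|20\d{2})\b")
WORD = re.compile(r"[a-z]+")


def temporal_segments(text):
    """Group the words of a text by the most recent year mentioned so far."""
    segments = defaultdict(list)
    current = None
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        years = YEAR.findall(sentence)
        if years:
            current = years[0]
        # Sentences before the first date have no temporal anchor; skip them.
        if current is not None:
            segments[current].extend(WORD.findall(sentence.lower()))
    return segments


def expand_query(query, docs, k=2):
    """Append the k words that most often share a segment with a query term."""
    terms = WORD.findall(query.lower())
    scores = Counter()
    for doc in docs:
        for words in temporal_segments(doc).values():
            if any(t in words for t in terms):
                # Crude stopword filter (assumption): ignore very short words.
                scores.update(w for w in words if w not in terms and len(w) > 3)
    return terms + [w for w, _ in scores.most_common(k)]


docs = [
    "In 1969 Apollo landed on the moon. Armstrong walked on the moon. "
    "In 1989 the Berlin wall fell."
]
expanded = expand_query("apollo", docs, k=1)
```

In this toy document, the second sentence carries no date of its own, so it inherits the 1969 segment, while the words of the 1989 segment never become candidates for the query "apollo".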
