Adaptive Time-to-Live Strategies for Query Result Caching in Web Search Engines

The freshness of search engine result caches is an important research problem that has only recently started to receive attention. In existing techniques in the literature, each cached search result page is associated with a fixed time-to-live (TTL) value in order to bound the staleness of the results presented to users, potentially as part of a more complex cache refresh or invalidation mechanism. In this paper, we propose techniques in which TTL values are set adaptively, on a per-query basis. Our results show that, compared to a strategy that uses a fixed TTL value for all queries, the proposed techniques reduce the fraction of stale results served by the cache and also decrease the fraction of redundant query evaluations on the search engine backend.
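To make the idea of a per-query TTL concrete, the following is a minimal sketch of one plausible adaptive policy. It is an illustrative assumption, not necessarily any of the techniques proposed in the paper: when an expired entry is re-evaluated and its results turn out to be unchanged, the query's TTL is multiplied by a growth factor; when the results have changed, the TTL is reset to a minimum. The class name `AdaptiveTTLCache`, the parameters `min_ttl`, `max_ttl`, and `growth`, and the `backend` callable are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CacheEntry:
    results: List[str]  # cached result page (e.g., a list of URLs)
    ttl: float          # current per-query TTL, in seconds
    cached_at: float    # time at which the entry was last computed


class AdaptiveTTLCache:
    """Result cache with a per-query TTL (hypothetical policy, see lead-in)."""

    def __init__(self, backend: Callable[[str], List[str]],
                 min_ttl: float = 60.0, max_ttl: float = 3600.0,
                 growth: float = 2.0) -> None:
        self.backend = backend      # evaluates a query on the search backend
        self.min_ttl = min_ttl
        self.max_ttl = max_ttl
        self.growth = growth
        self.entries: Dict[str, CacheEntry] = {}
        self.backend_calls = 0      # number of query evaluations issued

    def get(self, query: str, now: float) -> List[str]:
        entry = self.entries.get(query)
        if entry is not None and now - entry.cached_at < entry.ttl:
            # Entry has not expired: serve it (it may still be stale).
            return entry.results

        # Miss or expired entry: re-evaluate the query on the backend.
        fresh = self.backend(query)
        self.backend_calls += 1

        if entry is not None and fresh == entry.results:
            # The refresh was redundant: results are stable, so extend the TTL.
            ttl = min(entry.ttl * self.growth, self.max_ttl)
        else:
            # First evaluation, or results changed: fall back to a short TTL.
            ttl = self.min_ttl

        self.entries[query] = CacheEntry(fresh, ttl, now)
        return fresh
```

Under this assumed policy, queries whose results rarely change accumulate long TTLs and trigger fewer redundant backend evaluations, while queries whose results change often keep short TTLs and are refreshed sooner. A fixed-TTL cache corresponds to the special case `growth = 1.0`.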
