Understanding temporal query dynamics

Web search is strongly influenced by time. The queries people issue change over time, with some queries occasionally spiking in popularity (e.g., earthquake) and others remaining relatively constant (e.g., youtube). The documents indexed by the search engine also change, with some documents always being about a particular query (e.g., the Wikipedia page on earthquakes is about the query earthquake) and others being about the query only at a particular point in time (e.g., the New York Times is about earthquakes only following major seismic activity). The relationship between documents and queries can also change as people's intent changes (e.g., people sought different content for the query earthquake before the Haitian earthquake than they did after). In this paper, we explore how queries, their associated documents, and the query intent change over the course of 10 weeks by analyzing query log data, a daily Web crawl, and periodic human relevance judgments. We identify several features by which changes to query popularity can be classified, and show that the presence of these features, when accompanied by changes in result content, can be a good indicator of a change in query intent.
