Towards a Query-Less News Search Framework on Twitter

Twitter enables users to browse and access the latest news-related content. However, given a user’s interest in a particular news-related tweet, searching for related content can be a tedious process, because formulating an effective search query is not a trivial task. Moreover, given the small screens of most smartphones, users often prefer click-based operations to typing when retrieving related content. To address these issues, we introduce a new paradigm for news-related Twitter search called Search by Tweet (SbT). In this paradigm, a user submits a particular tweet, which triggers a search task that retrieves further related tweets. In this paper, we formalize the SbT problem and propose an effective and efficient framework that implements this functionality. At its core, we model the public Twitter stream as a dynamic graph-of-words that reflects the importance of both words and word correlations. Given an input tweet, our framework uses the graph model to generate an implicit query. Evaluations on a large-scale Twitter dataset and a user study show that our techniques are both efficient and effective.
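To make the graph-of-words idea concrete, the following is a minimal illustrative sketch, not the framework described in this paper. It assumes that node weights count recent word occurrences, that edge weights count recent co-occurrences within a tweet, and that all weights decay exponentially over time. The class `GraphOfWords`, the `half_life` parameter, and the scoring rule in `implicit_query` are hypothetical choices made only for illustration.

```python
import itertools
import math
import re


class GraphOfWords:
    """Toy dynamic graph-of-words (illustrative assumption, not the paper's model)."""

    def __init__(self, half_life=3600.0):
        # Hypothetical parameter: every weight halves after `half_life` seconds.
        self.decay = math.log(2) / half_life
        self.node = {}   # word -> decayed occurrence weight
        self.edge = {}   # (word_a, word_b) with word_a < word_b -> decayed weight
        self.last_update = 0.0

    @staticmethod
    def tokenize(text):
        # Lowercase, keep hashtags and mentions, drop duplicates, sort for stable pairs.
        return sorted(set(re.findall(r"[a-z0-9#@']+", text.lower())))

    def add_tweet(self, text, timestamp):
        # Age all existing weights, then add the new tweet's words and word pairs.
        factor = math.exp(-self.decay * max(0.0, timestamp - self.last_update))
        for key in self.node:
            self.node[key] *= factor
        for key in self.edge:
            self.edge[key] *= factor
        self.last_update = max(self.last_update, timestamp)

        words = self.tokenize(text)
        for word in words:
            self.node[word] = self.node.get(word, 0.0) + 1.0
        for pair in itertools.combinations(words, 2):
            self.edge[pair] = self.edge.get(pair, 0.0) + 1.0

    def implicit_query(self, text, k=3):
        # Score each word of the input tweet by its own weight plus the weights
        # of its edges to the other words of the same tweet; keep the top k.
        words = self.tokenize(text)
        scores = {}
        for word in words:
            score = self.node.get(word, 0.0)
            for other in words:
                if other != word:
                    pair = (min(word, other), max(word, other))
                    score += self.edge.get(pair, 0.0)
            scores[word] = score
        ranked = sorted(words, key=lambda w: (-scores[w], w))
        return ranked[:k]


graph = GraphOfWords(half_life=10.0)
graph.add_tweet("earthquake hits tokyo", timestamp=0.0)
graph.add_tweet("tokyo earthquake magnitude", timestamp=1.0)
graph.add_tweet("lunch was great", timestamp=2.0)
query_terms = graph.implicit_query("big earthquake in tokyo today", k=2)
```

In this sketch, words that appear in the input tweet but have no recent support in the stream receive a score of zero, so the implicit query falls back on the terms that the stream currently treats as important and correlated. A realistic system would also need stop-word handling and a more efficient decay scheme than rescaling every weight on each update.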
