Twitter has become a major outlet for news, discussion and commentary on ongoing events and trends. Effective searching of Twitter collections poses several challenges for traditional document-based information retrieval (IR) approaches, such as limited document term statistics and spam. In this paper we propose a novel approach to pseudo-relevance feedback, based upon the temporal profiles of n-grams extracted from the top N relevance feedback tweets. A weighted graph is used to model temporal correlation between n-grams, with a PageRank variant employed to combine evidence from both the pseudo-relevant document term distribution and the temporal behaviour of terms in the collection. Preliminary experiments with the TREC Microblogging 2011 Twitter corpus indicate that retrieval effectiveness can be improved through parameter optimisation.
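The graph-based scoring described above can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes that edge weights are the positive Pearson correlation between n-gram temporal profiles, that the PageRank variant is a personalised PageRank whose teleport distribution is the n-gram frequency in the feedback tweets, and that temporal profiles are per-bin collection counts. The function names, the damping value and the toy data are all hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_expansion_terms(feedback_tweets, profiles, damping=0.85, iters=50):
    """Score n-grams by personalised PageRank over a temporal-correlation graph.

    feedback_tweets: list of n-gram lists (the top-N pseudo-relevant tweets).
    profiles: dict mapping each n-gram to its collection count per time bin.
    Both the edge weighting and the teleport prior are assumptions of this sketch.
    """
    counts = Counter(t for tweet in feedback_tweets for t in tweet if t in profiles)
    terms = sorted(counts)
    if not terms:
        return []
    total = sum(counts.values())
    # Teleport distribution: n-gram distribution in the pseudo-relevant tweets.
    prior = {t: counts[t] / total for t in terms}

    # Edge weight: positive temporal correlation (negative values clipped to 0).
    weights = {
        u: {v: max(0.0, pearson(profiles[u], profiles[v])) for v in terms if v != u}
        for u in terms
    }
    out_sum = {u: sum(weights[u].values()) for u in terms}

    scores = dict(prior)
    for _ in range(iters):
        # Mass held by nodes without outgoing edges is redistributed via the prior.
        dangling = sum(scores[u] for u in terms if out_sum[u] == 0)
        new_scores = {}
        for v in terms:
            walk = sum(
                scores[u] * weights[u][v] / out_sum[u]
                for u in terms
                if u != v and out_sum[u] > 0
            )
            new_scores[v] = (1 - damping) * prior[v] + damping * (walk + dangling * prior[v])
        scores = new_scores
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): three bursty, correlated n-grams and one flat one.
feedback = [["quake", "japan"], ["quake", "tsunami"], ["lunch"]]
profiles = {
    "quake": [0, 1, 9, 3],
    "japan": [0, 2, 8, 2],
    "tsunami": [0, 0, 7, 4],
    "lunch": [3, 3, 3, 3],
}
ranked = rank_expansion_terms(feedback, profiles)
for term, score in ranked:
    print(f"{term}\t{score:.4f}")
```

In this toy setting the flat n-gram receives no incoming weight, so its score comes only from the teleport term, whereas n-grams that burst together reinforce one another. The top-ranked n-grams would then be candidates for query expansion.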