Microblog retrieval based on term similarity graph

In this paper, we introduce a microblog retrieval model based on a term similarity graph. We represent each post as an undirected weighted graph in which nodes are terms and edges link terms that co-occur. We then treat the graph as an electric network and compute the resistance distance between each pair of terms. For a given query topic, the mutual relevance between query terms and between term groups yields expansion terms that better describe the query. We also define a scoring and ranking model that improves search over large sets of microblogs by considering both the expanded words and tweet-specific attributes (hyperlinks and publication time). Our experiments use the two weeks of Twitter 2011 data provided by the TREC 2011 and 2012 Microblog Tracks. Results show that the model copes with the limited length of tweets better than traditional methods, that good expansion terms are found, and that the scoring and ranking model greatly improves microblog retrieval performance.
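To make the graph and resistance-distance steps concrete, the following is a minimal sketch rather than the model's actual implementation. It assumes that an edge weight is the number of posts in which two terms co-occur, that this weight acts as a conductance, and that the graph is connected; none of these choices is specified in the abstract. The function names (`cooccurrence_graph`, `solve`, `resistance_distance`) are hypothetical. The resistance distance is obtained by grounding the target term and solving the reduced Laplacian system for a unit current injected at the source term.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(posts):
    """Undirected weighted graph; weight = number of posts where two terms co-occur.

    Assumption: whitespace tokenisation and per-post co-occurrence counts.
    """
    weights = defaultdict(float)
    for post in posts:
        terms = sorted(set(post.lower().split()))
        for a, b in combinations(terms, 2):
            weights[(a, b)] += 1.0
    return dict(weights)


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: the graph is not connected")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def resistance_distance(weights, source, target):
    """Effective resistance between two terms, reading edge weights as conductances."""
    if source == target:
        return 0.0
    nodes = sorted({t for edge in weights for t in edge})
    # Ground the target: drop its row and column from the graph Laplacian.
    kept = [t for t in nodes if t != target]
    index = {t: i for i, t in enumerate(kept)}
    n = len(kept)
    lap = [[0.0] * n for _ in range(n)]
    for (a, b), w in weights.items():
        for u, v in ((a, b), (b, a)):
            if u in index:
                lap[index[u]][index[u]] += w
                if v in index:
                    lap[index[u]][index[v]] -= w
    rhs = [0.0] * n
    rhs[index[source]] = 1.0  # inject one unit of current at the source
    potentials = solve(lap, rhs)
    # With the target at zero potential, the source potential is the resistance.
    return potentials[index[source]]


if __name__ == "__main__":
    graph = cooccurrence_graph(["obama speech", "speech congress"])
    print(resistance_distance(graph, "obama", "congress"))
```

Under these assumptions, terms that co-occur often or are joined by many short paths end up at a small resistance distance, which is the property that makes them candidates for query expansion. A smaller distance here means stronger relatedness; how distances are converted into expansion scores is left open in this sketch.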