Tweet Ranking Based on Heterogeneous Networks

Ranking tweets is a fundamental task to make it easier to distill the vast amounts of information shared by users. In this paper, we explore the novel idea of ranking tweets on a topic using heterogeneous networks. We construct heterogeneous networks by harnessing cross-genre linkages between tweets and semantically-related web documents from formal genres, and inferring implicit links between tweets and users. To rank tweets effectively by capturing the semantics and importance of different linkages, we introduce Tri-HITS, a model to iteratively propagate ranking scores across heterogeneous networks. We show that integrating both formal genre and inferred social networks with tweet networks produces a higher-quality ranking than the tweet networks alone. 1 Title and Abstract in Chinese u

[1]  Michael R. Lyu,et al.  A generalized Co-HITS algorithm and its application to bipartite graphs , 2009, KDD.

[2]  Rada Mihalcea,et al.  TextRank: Bringing Order into Text , 2004, EMNLP.

[3]  Tarek F. Abdelzaher,et al.  On truth discovery in social sensing: A maximum likelihood estimation approach , 2012, International Symposium on Information Processing in Sensor Networks.

[4]  George A. Miller,et al.  WordNet: A Lexical Database for English , 1995, HLT.

[5]  Hanan Samet,et al.  TwitterStand: news in tweets , 2009, GIS.

[6]  Harry Shum,et al.  An Empirical Study on Learning to Rank of Tweets , 2010, COLING.

[7]  Qi He,et al.  TwitterRank: finding topic-sensitive influential twitterers , 2010, WSDM '10.

[8]  Rada Mihalcea,et al.  Graph-based Ranking Algorithms for Sentence Extraction, Applied to Text Summarization , 2004, ACL.

[9]  Jaana Kekäläinen,et al.  Cumulated gain-based evaluation of IR techniques , 2002, TOIS.

[10]  Fabio Massimo Zanzotto,et al.  Linguistic Redundancy in Twitter , 2011, EMNLP.

[11]  Charu C. Aggarwal,et al.  On Bayesian interpretation of fact-finding in information networks , 2011, 14th International Conference on Information Fusion.

[12]  Yi Yang,et al.  Quality-biased Ranking of Short Texts in Microblogging Services , 2011, IJCNLP.

[13]  Fernando Diaz,et al.  Time is of the essence: improving recency ranking using Twitter data , 2010, WWW '10.

[14]  Brian Uzzi,et al.  Synchronicity, instant messaging, and performance among financial traders , 2011, Proceedings of the National Academy of Sciences.

[15]  Jiawei Han,et al.  Evaluating Event Credibility on Twitter , 2012, SDM.

[16]  Hiroyuki Kitagawa,et al.  TURank: Twitter User Ranking Based on User-Tweet Graph Analysis , 2010, WISE.

[17]  Jugal K. Kalita,et al.  Experiments in Microblog Summarization , 2010, 2010 IEEE Second International Conference on Social Computing.

[18]  Jiawei Han,et al.  Heterogeneous network-based trust analysis: a survey , 2011, SKDD.

[19]  Jugal K. Kalita,et al.  Comparing Twitter Summarization Algorithms for Multiple Post Summaries , 2011, 2011 IEEE Third Int'l Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third Int'l Conference on Social Computing.

[20]  John Hannon,et al.  Recommending twitter users to follow using content and collaborative filtering approaches , 2010, RecSys '10.

[21]  Martin R. Gibbs,et al.  Mediating intimacy: designing technologies to support strong-tie relationships , 2005, CHI.

[22]  I. Couzin Collective minds , 2007, Nature.

[23]  Mor Naaman,et al.  Finding and assessing social media information sources in the context of journalism , 2012, CHI.

[24]  O. Sornil,et al.  An Automatic Text Summarization Approach using Content-Based and Graph-Based Characteristics , 2006, 2006 IEEE Conference on Cybernetics and Intelligent Systems.

[25]  Ben Carterette,et al.  Probabilistic models of ranking novel documents for faceted topic retrieval , 2009, CIKM.

[26]  W. Bruce Croft,et al.  User oriented tweet ranking: a filtering approach to microblogs , 2011, CIKM '11.

[27]  Rajeev Motwani,et al.  The PageRank Citation Ranking : Bringing Order to the Web , 1999, WWW 1999.

[28]  Bo Zhao,et al.  Probabilistic topic models with biased propagation on heterogeneous information networks , 2011, KDD.

[29]  Ryan T. McDonald A Study of Global Inference Algorithms in Multi-document Summarization , 2007, ECIR.

[30]  Robert L. Mercer,et al.  Class-Based n-gram Models of Natural Language , 1992, CL.

[31]  Barbara Poblete,et al.  Information credibility on twitter , 2011, WWW.

[32]  Julio Gonzalo,et al.  Towards real-time summarization of scheduled events from twitter streams , 2012, HT '12.

[33]  Scott A. Golder A Structural Approach to Contact Recommendations in Online Social Networks , 2009 .