Diversified ranking on large graphs: an optimization viewpoint

Diversified ranking on graphs is a fundamental mining task with a variety of high-impact applications. Two important questions remain open. The first concerns the measure: how can we quantify the goodness of a given top-k ranking list so that it captures both relevance and diversity? The second concerns the algorithm: how can we find, in a scalable way, an optimal or near-optimal top-k ranking list that maximizes this measure? In this paper, we address both challenges from an optimization point of view. First, we propose a goodness measure for a given top-k ranking list. The measure intuitively captures both (a) the relevance between each individual node in the ranking list and the query, and (b) the diversity among the nodes in the ranking list. Second, we propose a scalable algorithm (linear with respect to the size of the graph) that generates a provably near-optimal solution. Experimental evaluations on real graphs demonstrate its effectiveness and efficiency.
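The abstract does not spell out the measure or the algorithm, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes relevance is scored by a random walk with restart from the query, and it uses a hypothetical goodness function g(S) = w * sum of r_i over S, minus the sum of r_i * A_ij * r_j over pairs in S, where A_ij is the edge weight. The names `personalized_pagerank`, `goodness`, `greedy_top_k`, the weight `w`, and the damping value are all illustrative assumptions. The greedy loop adds the node with the largest marginal gain at each step; no optimality guarantee is claimed for this sketch.

```python
from typing import Dict, Hashable, List

# Assumed input format: node -> {neighbor: edge weight}; every neighbor is also a key.
Graph = Dict[Hashable, Dict[Hashable, float]]


def personalized_pagerank(graph: Graph, query: Hashable,
                          damping: float = 0.85, iters: int = 100) -> Dict[Hashable, float]:
    """Relevance of each node to `query` by power iteration (random walk with restart)."""
    r = {v: 0.0 for v in graph}
    r[query] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in graph}
        for u, nbrs in graph.items():
            out = sum(nbrs.values())
            if out == 0:
                # Dangling node: return its walking mass to the query.
                nxt[query] += damping * r[u]
                continue
            for v, weight in nbrs.items():
                nxt[v] += damping * r[u] * weight / out
        nxt[query] += 1.0 - damping  # restart step
        r = nxt
    return r


def goodness(graph: Graph, r: Dict[Hashable, float],
             subset: List[Hashable], w: float = 2.0) -> float:
    """Hypothetical measure: weighted relevance minus pairwise redundancy."""
    relevance = w * sum(r[i] for i in subset)
    redundancy = sum(r[i] * graph[i].get(j, 0.0) * r[j]
                     for i in subset for j in subset)
    return relevance - redundancy


def greedy_top_k(graph: Graph, r: Dict[Hashable, float],
                 k: int, w: float = 2.0) -> List[Hashable]:
    """Pick k nodes one at a time, each maximizing the marginal gain in `goodness`."""
    chosen: List[Hashable] = []
    # penalty[v] = sum over chosen j of (A_vj + A_jv) * r_j
    penalty = {v: 0.0 for v in graph}
    for _ in range(min(k, len(graph))):
        best, best_gain = None, float("-inf")
        for v in graph:
            if v in chosen:
                continue
            gain = w * r[v] - r[v] * (graph[v].get(v, 0.0) * r[v] + penalty[v])
            if gain > best_gain:
                best, best_gain = v, gain
        chosen.append(best)
        for u in graph:
            penalty[u] += (graph[u].get(best, 0.0) + graph[best].get(u, 0.0)) * r[best]
    return chosen


if __name__ == "__main__":
    toy = {
        "q": {"a": 1.0, "b": 1.0, "c": 1.0},
        "a": {"q": 1.0, "b": 1.0},
        "b": {"q": 1.0, "a": 1.0},
        "c": {"q": 1.0},
    }
    scores = personalized_pagerank(toy, "q")
    ranking = greedy_top_k(toy, scores, k=3)
    print(ranking, goodness(toy, scores, ranking))
```

In this sketch each greedy step scans all nodes, so selecting k nodes costs on the order of k times the number of nodes once the relevance scores are known. Raising the edge weights or lowering `w` makes the redundancy term count for more, which pushes the selection toward nodes that are not connected to those already chosen.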

Jingrui He | Hanghang Tong | Ching-Yung Lin | Zhen Wen | Ravi B. Konuru
