ClusCite: effective citation recommendation by information network-based clustering

Citation recommendation is an interesting but challenging research problem. Most existing studies assume that all papers adopt the same criterion and follow the same behavioral pattern in deciding relevance and authority of a paper. However, in reality, papers have distinct citation behavioral patterns when looking for different references, depending on paper content, authors and target venues. In this study, we investigate the problem in the context of heterogeneous bibliographic networks and propose a novel cluster-based citation recommendation framework, called ClusCite, which explores the principle that citations tend to be softly clustered into interest groups based on multiple types of relationships in the network. Therefore, we predict each query's citations based on related interest groups, each having its own model for paper authority and relevance. Specifically, we learn group memberships for objects and the significance of relevance features for each interest group, while also propagating relative authority between objects, by solving a joint optimization problem. Experiments on both DBLP and PubMed datasets demonstrate the power of the proposed approach, with 17.68% improvement in Recall@50 and 9.57% growth in MRR over the best performing baseline.

[1]  C. Lee Giles,et al.  Popularity Weighted Ranking for Academic Digital Libraries , 2007, ECIR.

[2]  Jiawei Han,et al.  Citation Prediction in Heterogeneous Bibliographic Networks , 2012, SDM.

[3]  P. Tseng Convergence of a Block Coordinate Descent Method for Nondifferentiable Minimization , 2001 .

[4]  Jie Tang,et al.  ArnetMiner: extraction and mining of academic social networks , 2008, KDD.

[5]  Jiawei Han,et al.  Ranking-based classification of heterogeneous information networks , 2011, KDD.

[6]  Jon M. Kleinberg,et al.  The link-prediction problem for social networks , 2007, J. Assoc. Inf. Sci. Technol..

[7]  Yizhou Sun,et al.  Personalized entity recommendation: a heterogeneous information network approach , 2014, WSDM.

[8]  Philip S. Yu,et al.  PathSim , 2011, Proc. VLDB Endow..

[9]  Chris H. Q. Ding,et al.  Collaborative Filtering: Weighted Nonnegative Matrix Factorization Incorporating User and Item Graphs , 2010, SDM.

[10]  Jure Leskovec,et al.  Supervised random walks: predicting and recommending links in social networks , 2010, WSDM '11.

[11]  Soumen Chakrabarti,et al.  Learning to rank networked entities , 2006, KDD '06.

[12]  Taher H. Haveliwala Topic-sensitive PageRank , 2002, IEEE Trans. Knowl. Data Eng..

[13]  Daniel Kifer,et al.  Context-aware citation recommendation , 2010, WWW '10.

[14]  Wei-Ying Ma,et al.  Object-level ranking: bringing order to Web objects , 2005, WWW '05.

[15]  Daniel Jurafsky,et al.  Who should I cite: learning literature search models from citation behavior , 2010, CIKM.

[16]  Wenyi Huang,et al.  Recommending citations: translating papers into references , 2012, CIKM.

[17]  Mikhail Belkin,et al.  Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples , 2006, J. Mach. Learn. Res..

[18]  Vagelis Hristidis,et al.  ObjectRank: Authority-Based Keyword Search in Databases , 2004, VLDB.

[19]  Yifan Hu,et al.  Collaborative Filtering for Implicit Feedback Datasets , 2008, 2008 Eighth IEEE International Conference on Data Mining.

[20]  W. Bruce Croft,et al.  Recommending citations for academic papers , 2007, SIGIR.

[21]  Thorsten Joachims,et al.  Optimizing search engines using clickthrough data , 2002, KDD.

[22]  Ramesh Nallapati,et al.  Joint latent topic models for text and citations , 2008, KDD.

[23]  Chun Chen,et al.  Personalized tag recommendation using graph-based ranking on multi-type interrelated objects , 2009, SIGIR.

[24]  Jie Tang,et al.  A Discriminative Approach to Topic-Based Citation Recommendation , 2009, PAKDD.

[25]  Nitesh V. Chawla,et al.  Link Prediction and Recommendation across Heterogeneous Social Networks , 2012, 2012 IEEE 12th International Conference on Data Mining.

[26]  Yizhou Sun,et al.  Ranking-based clustering of heterogeneous information networks with star network schema , 2009, KDD.

[27]  Wei Tang,et al.  Supervised Link Prediction Using Multiple Sources , 2010, 2010 IEEE International Conference on Data Mining.