Context-aware citation recommendation

When writing a paper, how often do you want to cite something at a particular place but are not sure which papers to cite? Would you like a recommendation system that suggests a small number of good candidates for every place where you want to add citations? In this paper, we present our initiative to build a context-aware citation recommendation system. High-quality citation recommendation is challenging: the recommended citations should not only be relevant to the paper under composition, but should also match the local contexts of the places where the citations are made. Moreover, it is far from trivial to model how the topic of the whole paper and the contexts of the citation places should affect the selection and ranking of citations. To tackle the problem, we develop a context-aware approach. The core idea is to design a novel non-parametric probabilistic model that measures the context-based relevance between a citation context and a document. Our approach can effectively recommend citations for a single context, and it can also recommend a high-quality set of citations for a whole paper. We implement a prototype system in CiteSeerX. An extensive empirical evaluation in the CiteSeerX digital library against many baselines demonstrates the effectiveness and the scalability of our approach.
