Recommending citations: translating papers into references

When we write or prepare to write a research paper, we usually have appropriate references in mind. However, there are most likely references that we have missed and that should have been read and cited. A good citation recommendation system would therefore improve not only our paper but also the overall efficiency and quality of literature search. Usually, a citation's context contains explicit words explaining the citation. Building on this observation, we propose a method that "translates" research papers into references. By treating the citations and their contexts from existing papers as parallel data written in two different "languages", we adopt a translation model to learn the relationship between these two "vocabularies". Experiments on both the CiteSeer and CiteULike datasets show that our approach outperforms the baseline methods and increases precision, recall, and F-measure by at least 5% to 10%. In addition, our approach runs much faster in both the training and recommending stages, which demonstrates the effectiveness and scalability of our work.
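To make the "parallel data" idea concrete, the following is a minimal sketch of how word-to-reference translation probabilities could be estimated from (citation context, cited references) pairs. It assumes an IBM Model 1 style EM procedure and a uniform weighting of query words; the abstract does not say which translation model or scoring rule is actually used, so these choices, the function names `train_translation_probs` and `recommend`, and the toy data are all illustrative assumptions.

```python
from collections import defaultdict


def train_translation_probs(pairs, iterations=10):
    """Estimate t[word][ref] ~ p(ref | word) with IBM Model 1 style EM.

    pairs: list of (context_words, cited_refs) tuples (hypothetical format).
    """
    words = {w for ctx, _ in pairs for w in ctx}
    refs = {r for _, cited in pairs for r in cited}
    # Start from a uniform distribution over references for every word.
    t = {w: {r: 1.0 / len(refs) for r in refs} for w in words}

    for _ in range(iterations):
        count = defaultdict(float)  # expected (word, ref) alignment counts
        total = defaultdict(float)  # expected counts per word
        # E-step: softly align each cited reference to the context words.
        for ctx, cited in pairs:
            for r in cited:
                z = sum(t[w][r] for w in ctx)
                if z == 0.0:
                    continue
                for w in ctx:
                    c = t[w][r] / z
                    count[(w, r)] += c
                    total[w] += c
        # M-step: renormalise so each word's probabilities sum to 1.
        for w in words:
            if total[w] > 0.0:
                for r in refs:
                    t[w][r] = count[(w, r)] / total[w]
    return t


def recommend(t, query_words, top_k=3):
    """Rank references by the average of p(ref | word) over known query words."""
    known = [w for w in query_words if w in t]
    scores = defaultdict(float)
    for w in known:
        for r, p in t[w].items():
            scores[r] += p / len(known)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Toy parallel data: citation-context words paired with made-up reference ids.
pairs = [
    (["topic", "model", "latent"], ["LDA"]),
    (["topic", "model", "em", "algorithm"], ["LDA", "EM"]),
    (["em", "algorithm", "likelihood"], ["EM"]),
    (["translation", "alignment", "model"], ["IBM"]),
]
t = train_translation_probs(pairs)
print(recommend(t, ["latent", "topic"]))
```

In this sketch, a new manuscript is reduced to its words, and each candidate reference is scored by how strongly those words "translate" into it. A real system would also need smoothing and a way to handle words unseen in training, which are omitted here.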
