Result Diversification in Automatic Citation Recommendation

The growing number of papers published each year makes manual literature search inefficient and, moreover, insufficient. Hence, automated reference/citation recommendation has been of interest for the last three to four decades. Unfortunately, some of the developed approaches, such as keyword-based ones, are prone to ambiguity and synonymy. Approaches that use citation information, on the other hand, do not suffer from these problems, since they do not rely on textual similarity. Today, obtaining the desired information is as hard as looking for a needle in a haystack. And sometimes we want a small haystack, e.g., a small result set containing only a few recommendations, to cover all the important and relevant parts of the literature. That is, the set should be sufficiently diverse. Here, we investigate the problem of result diversification in automatic citation recommendation. We enhance existing techniques, which were designed to recommend a set of citations with satisfactory quality and diversity, with direction awareness, allowing users to reach either old, well-cited, well-known research papers or recent, less-known ones. We also propose novel techniques for better result diversification. Experimental results show that our techniques are very useful in automatic citation recommendation.
