On Constructing Seminal Paper Genealogy

Suppose someone is starting research on a topic that is unfamiliar to them. Which seminal papers have influenced the topic the most? What is the genealogy of the seminal papers on this topic? These are the questions such a researcher may raise, and we try to answer them in this paper. First, we propose an algorithm that finds a set of seminal papers on a given topic. We also address the performance and scalability issues of this sophisticated algorithm. Next, we discuss measures for deciding how much one paper is influenced by another. Then, we propose an algorithm that constructs a genealogy of the seminal papers by using the influence measure and citation information. Finally, through extensive experiments with a large volume of real-world academic literature data, we show the effectiveness and efficiency of our approach.
