Go Wide, Go Deep: Quantifying the Impact of Scientific Papers Through Influence Dispersion Trees

Despite the long history of 'citation count' as a measure of scientific impact, the evolution of the follow-up work that a paper inspires, and the interactions among those works through citation links, has rarely been explored to quantify how the paper enriches the depth and breadth of a research field. We propose a novel data structure, called the Influence Dispersion Tree (IDT), to model the organization of a paper's follow-up papers and their dependencies through citations. We also propose the notion of an ideal IDT for every paper and show that an ideal (highly influential) paper should increase the knowledge of a field both vertically and horizontally. We study the structural properties of the IDT, both theoretically and empirically, and propose two metrics, the Influence Dispersion Index (IDI) and the Normalized Influence Divergence (NID), to quantify the influence of a paper. Our theoretical analysis shows that an ideal IDT configuration should have equal depth and breadth (and thus minimize the NID value). We show that NID is a better influence measure than raw citation count in two experimental settings. First, on a large real-world bibliographic dataset, NID outperforms raw citation count as an early predictor of the number of new citations a paper will receive within a certain period after publication. Second, NID is better than raw citation count at identifying, among all contemporary papers published in the same venue, the papers recognized as highly influential through a 'Test of Time Award'.
