Network Analysis for Predicting Academic Impact

How are scholars ranked for promotion, tenure, and honors? How can we improve the quantitative tools available to those who make such decisions? Can we predict the academic impact of scholars and papers at an early stage? Academic decisions about hiring, tenure, and prizes are currently largely subjective. In the era of “Big Data,” a solid set of quantitative measurements should be used to support this decision process. This paper presents a method for predicting the probability that a paper will be among the most cited papers, using only data available at the time of publication. We find that highly cited papers have different structural properties, and that the centrality measures capturing these properties are associated with increased odds of being in the top percentile of citation count. The paper also presents a method for predicting the future impact of researchers using information available early in their careers. This model integrates information about changes in a young researcher’s role in the citation network and the co-authorship network, and demonstrates how this information improves predictions of their future impact. These results show that quantitative methods can complement the qualitative decision-making process in academia and improve the prediction of academic impact.
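The abstract does not say which centrality measures or which statistical model the paper uses, so the following is only a minimal sketch under stated assumptions. It computes two common centrality measures (degree and betweenness) on a made-up co-authorship graph and feeds them into a logistic model, a form suggested by the phrase "increased odds" but not confirmed by the text. The graph, the function names, the coefficients, and the intercept are all hypothetical and chosen purely for illustration.

```python
import math
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def betweenness_centrality(graph):
    """Unnormalized betweenness for an unweighted, undirected graph
    (Brandes' algorithm): the number of shortest paths between other
    node pairs that pass through each node."""
    bc = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)   # number of shortest paths from s
        dist = dict.fromkeys(graph, -1)   # BFS distance from s
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:           # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:                      # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


def top_percentile_probability(features, coefficients, intercept):
    """Logistic model: P(top percentile) = 1 / (1 + exp(-(b0 + sum b_k x_k)))."""
    z = intercept + sum(coefficients[k] * features[k] for k in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical toy co-authorship graph (undirected adjacency sets).
graph = {
    "A": {"B", "C", "D"},
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "E"},
    "E": {"D"},
}

degree = degree_centrality(graph)
betweenness = betweenness_centrality(graph)

# Hypothetical coefficients; in practice they would be estimated from data.
coefficients = {"degree": 2.0, "betweenness": 0.3}
intercept = -4.0

probabilities = {
    v: top_percentile_probability(
        {"degree": degree[v], "betweenness": betweenness[v]},
        coefficients,
        intercept,
    )
    for v in graph
}

# In a logistic model, exp(coefficient) is the odds ratio per unit increase.
odds_ratios = {k: math.exp(b) for k, b in coefficients.items()}
```

In this sketch, node "A" bridges the most pairs and so receives the highest predicted probability. With real data the coefficients would be fitted to observed citation outcomes, and each `odds_ratios` entry would then read as the multiplicative change in the odds of reaching the top percentile per unit increase in that measure.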
