Analysis of bibliometric indicators for individual scholars in a large data set

Citation numbers and other quantities derived from bibliographic databases are becoming standard tools for assessing the productivity and impact of research activities. Although these indicators are widely used, their statistical properties have not yet been well established. This is especially true of bibliometric indicators aimed at evaluating individual scholars, because large-scale data sets are typically difficult to retrieve. Here, we take advantage of a recently introduced large bibliographic data set, Google Scholar Citations, which collects the entire publication record of individual scholars. We analyze the scientific profiles of more than 30,000 researchers and study the relation between the h-index, the number of publications, and the number of citations of individual scientists. While the number of publications of a scientist has a rather weak relation with his/her h-index, we find that the h-index of a scientist is strongly correlated with the number of citations that she/he has received, so that the number of citations can effectively be used as a proxy for the h-index. Allowing the h-index to depend on both the number of citations and the number of publications yields only a minor improvement.
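As an illustration of the quantities discussed above, the following minimal Python sketch computes the h-index from a list of per-paper citation counts and fits a log-log straight line relating the h-index to the total number of citations. The function names `h_index` and `fit_power_law`, the power-law functional form, and the toy profiles are assumptions made for this example; they are not taken from the paper, and the toy numbers are not real data.

```python
import math


def h_index(citations):
    """Largest h such that at least h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(a) + b * log(x); returns (a, b).

    Assumes all values are strictly positive. The power-law form is an
    illustrative choice, not the model reported in the paper.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Hypothetical toy profiles: each inner list holds one scholar's per-paper citations.
profiles = [
    [10, 8, 5, 4, 3],
    [25, 8, 5, 3, 3],
    [3, 1, 1],
    [50, 40, 30, 20, 10, 5, 1],
]

total_citations = [sum(p) for p in profiles]
h_values = [h_index(p) for p in profiles]

# Fit h ~ a * (total citations)^b on the toy data.
prefactor, exponent = fit_power_law(total_citations, h_values)
print(h_values, round(prefactor, 3), round(exponent, 3))
```

On a real data set, the same fit could be repeated with the number of publications as the predictor to compare how well each quantity tracks the h-index.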
