On a formula for the h-index

The h-index is a celebrated indicator widely used to assess the quality of researchers and organizations. Empirical studies indicate that the h-index is well correlated with other simple bibliometric indicators, such as the total number of publications N and the total number of citations C. In this paper we introduce a new formula, h̃_W = h̃_W(N, C, c_MAX), as a representative predictive formula that functionally relates h to three aggregate indicators: N, C and the highest citation count c_MAX. The formula is derived under the specific assumption of geometrically distributed citations, but it provides a good estimate of the h-index in the general case. To evaluate how well the proposed formula h̃_W fits, we carried out an empirical study with 131 datasets (13,347 papers; 288,972 citations). The overall fit (defined as the capacity of h̃_W to reproduce the true value of h for each individual scientist) was remarkably accurate. The predicted value was within one of the actual value of h for more than 60% of the datasets. In approximately three cases out of four the absolute error was less than or equal to 2, and the average absolute error over the whole sample of datasets was only 1.9.
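The quantities named in the abstract can be illustrated with a minimal Python sketch. It computes the true h-index and the three aggregate indicators (N, C and the highest citation count) from one scientist's citation counts, and summarizes prediction errors in the way the abstract reports them (share within 1, share within 2, mean absolute error). The function names and the sample numbers are hypothetical and chosen only for illustration. The predictive formula itself is not stated in this text, so it is not implemented here; the `predicted` values are treated as given.

```python
def h_index(citations):
    """Largest h such that at least h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def aggregates(citations):
    """Return (N, C, c_max): papers, total citations, highest citation count."""
    return len(citations), sum(citations), max(citations, default=0)


def fit_summary(actual, predicted):
    """Summarize absolute errors between actual and predicted h values.

    `predicted` is assumed to come from some predictive formula; this
    sketch does not compute it.
    """
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    n = len(errors)
    return {
        "within_1": sum(e <= 1 for e in errors) / n,
        "within_2": sum(e <= 2 for e in errors) / n,
        "mean_abs_error": sum(errors) / n,
    }


if __name__ == "__main__":
    sample = [10, 8, 5, 4, 3]  # hypothetical citation counts
    print(h_index(sample), aggregates(sample))
    print(fit_summary([4, 10, 7], [4, 12, 10]))  # hypothetical h values
```

Sorting the counts in descending order makes the definition direct: h is the last rank at which the citation count is still at least the rank.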
