On a heuristic point of view concerning the citation distribution: introducing the Wakeby distribution

Abstract

The paper proposes a heuristic approach to modeling the cumulative distribution of citations of papers in scientific journals by means of the Wakeby distribution. The Markov process of citation leading to the Wakeby distribution is analyzed using the terminal time formalism. The Wakeby distribution is derived in the paper from the simple and general inhomogeneous Choquet–Deny convolution equation for a non-probability measure. We give statistical evidence that the Wakeby distribution is a reasonable approximation of the empirical citation distributions.

AMS Subject Classification: 91D30; 91D99
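For readers unfamiliar with the Wakeby distribution, the following is a minimal sketch of its quantile function. It assumes the standard five-parameter form commonly used in the L-moments literature, x(F) = ξ + (α/β)(1 − (1 − F)^β) − (γ/δ)(1 − (1 − F)^(−δ)), which may differ in notation from the form used in the paper. The function names and the parameter values are illustrative assumptions, not values fitted to citation data.

```python
import random


def wakeby_quantile(f, xi, alpha, beta, gamma, delta):
    """Quantile function x(F) of the Wakeby distribution.

    Assumes the standard five-parameter form with beta != 0 and
    delta != 0; the limiting cases are not handled in this sketch.
    """
    if not 0.0 <= f < 1.0:
        raise ValueError("f must lie in [0, 1)")
    if beta == 0.0 or delta == 0.0:
        raise ValueError("this sketch requires beta != 0 and delta != 0")
    s = 1.0 - f  # survival probability 1 - F
    return (xi
            + (alpha / beta) * (1.0 - s ** beta)
            - (gamma / delta) * (1.0 - s ** (-delta)))


def wakeby_sample(n, xi, alpha, beta, gamma, delta, seed=0):
    """Draw n values by inverse-transform sampling of the quantile function."""
    rng = random.Random(seed)  # rng.random() returns values in [0, 1)
    return [wakeby_quantile(rng.random(), xi, alpha, beta, gamma, delta)
            for _ in range(n)]


# Hypothetical parameter values, chosen only for illustration.
PARAMS = dict(xi=0.0, alpha=5.0, beta=3.0, gamma=1.0, delta=0.3)

if __name__ == "__main__":
    for f in (0.1, 0.5, 0.9, 0.99):
        print(f, wakeby_quantile(f, **PARAMS))
```

Because the distribution is defined through its quantile function, sampling reduces to evaluating x(F) at uniform random F, and a positive δ gives the heavy upper tail that makes the family a candidate for skewed citation counts.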
