Statistical validation of a global model for the distribution of the ultimate number of citations accrued by papers published in a scientific journal

A central issue in evaluative bibliometrics is the characterization of the citation distribution of papers in the scientific literature. Here, we perform a large-scale empirical analysis of journals from every field in Thomson Reuters' Web of Science database. We find that only 30 of the 2,184 journals have citation distributions that are inconsistent with a discrete lognormal distribution at the rejection threshold that controls the false discovery rate at 0.05. Large, multidisciplinary journals are over-represented in this set of 30 journals, leading us to conclude that, within a discipline, citation distributions are lognormal. Our results strongly suggest that the discrete lognormal distribution is a globally accurate model for the distribution of “eventual impact” of scientific papers published in a single-discipline journal in a single year that is sufficiently removed from the present date. © 2010 Wiley Periodicals, Inc.
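
The idea of a rejection threshold that controls the false discovery rate can be illustrated with a minimal sketch. It assumes a Benjamini-Hochberg step-up rule applied to one goodness-of-fit p-value per journal; the abstract does not state which procedure was used, and the function names and p-values below are hypothetical, chosen only for illustration.

```python
def bh_threshold(p_values, q=0.05):
    """Largest p-value rejected by the Benjamini-Hochberg step-up rule.

    Returns None when no hypothesis is rejected at FDR level q.
    """
    m = len(p_values)
    threshold = None
    # Compare the k-th smallest p-value with q * k / m and keep the
    # largest p-value that passes; everything at or below it is rejected.
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= q * rank / m:
            threshold = p
    return threshold


def rejected(p_values, q=0.05):
    """Boolean flag per input p-value: True if that hypothesis is rejected."""
    t = bh_threshold(p_values, q)
    return [t is not None and p <= t for p in p_values]


# Hypothetical p-values, one per journal (illustration only, not real data).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042,
             0.06, 0.074, 0.205, 0.212, 0.216]
flags = rejected(example_p, q=0.05)
print(sum(flags), "of", len(example_p), "journals flagged as inconsistent")
```

In this sketch a journal is flagged as inconsistent with the model only if its p-value falls at or below the data-dependent threshold, which is stricter than testing each journal separately at 0.05.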
