Three-feature model to reproduce the topology of citation networks and the effects from authors' visibility on their h-index

Various factors are believed to govern the selection of references in citation networks, but a precise, quantitative determination of their importance has remained elusive. In this paper, we show that three factors can account for the referencing pattern of citation networks for two topics, namely “graphenes” and “complex networks”, making it possible to reproduce the topological features of networks in which papers are the nodes and citations establish the edges. The most relevant factor was content similarity, while the importance of the other two, in-degree (i.e. citation count) and age of publication, varied with the topic studied. This dependence indicates that additional factors could play a role. Indeed, one would intuitively expect the reputation (or visibility) of authors and/or institutions to affect the referencing pattern, yet it is considered only indirectly, via the in-degree, which should correlate with such reputation. Because information on reputation is not readily available, we simulated its effect on artificial citation networks comprising two communities with distinct fitness (visibility) parameters. One community was assigned twice the fitness value of the other, which amounts to twice the probability of a paper being cited. While the h-index of authors in the community with larger fitness evolved over time with slightly higher values than in the control network (no fitness considered), a drastic effect was noted for the community with smaller fitness.
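The two-community fitness experiment can be illustrated with a minimal Python sketch. It is not the model used in the paper: content similarity and publication age are left out, and the attachment weight `fitness * (in_degree + 1)` is an assumption chosen so that doubling the fitness doubles a paper's chance of being cited at equal in-degree. The names `h_index`, `simulate`, `n_papers` and `refs_per_paper` are hypothetical.

```python
import random


def h_index(citations):
    """Largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def simulate(n_papers=500, refs_per_paper=5, fitness=(2.0, 1.0), seed=0):
    """Grow a toy citation network with two communities (0 and 1).

    Assumption: a new paper cites an existing paper p with probability
    proportional to fitness[community[p]] * (in_degree[p] + 1).
    Returns (community, in_degree), two lists indexed by paper.
    """
    rng = random.Random(seed)
    community = []
    in_degree = []
    for new in range(n_papers):
        if new > 0:
            weights = [fitness[community[p]] * (in_degree[p] + 1)
                       for p in range(new)]
            k = min(refs_per_paper, new)
            cited = set()
            # Draw until k distinct earlier papers have been selected.
            while len(cited) < k:
                cited.add(rng.choices(range(new), weights=weights)[0])
            for p in cited:
                in_degree[p] += 1
        # Assign the new paper to one of the two communities at random.
        community.append(rng.randrange(2))
        in_degree.append(0)
    return community, in_degree


if __name__ == "__main__":
    community, in_degree = simulate()
    for c in (0, 1):
        counts = [d for d, g in zip(in_degree, community) if g == c]
        # Coarse proxy: h-index over all papers of a community, not per author.
        print(f"community {c}: papers={len(counts)}, h={h_index(counts)}")
```

Passing `fitness=(1.0, 1.0)` gives a control network without fitness. Computing one h-index per community is a simplification; the paper tracks the h-index of individual authors over time, which would additionally require assigning papers to authors.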
