Exploring and Understanding Citation-Based Scientific Metrics

This paper explores citation-based metrics: how they differ in ranking papers and authors, and why. We first take as examples three metrics that we believe are significant: the standard citation count, the increasingly popular h-index, and a variation of PageRank that we propose for papers (called PaperRank), which is appealing because it mirrors proven and successful algorithms for ranking web pages. In analyzing them, we develop generally applicable techniques and metrics for qualitatively and quantitatively analyzing indexes that evaluate content and people, and for understanding the causes of their different behaviors. Finally, we extend the analysis to other popular indexes to show whether the choice of index has a significant effect on how papers and authors are ranked. We put the techniques to work on a dataset of over 260K ACM papers and found that the difference in ranking results is indeed very significant (even when restricting to citation-based indexes): half of the top-ranked papers differ in a typical 20-element search result page for papers on a given topic, and the top researcher is ranked differently more than half of the time in an average job posting with 100 applicants.
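To make the three metrics concrete, here is a minimal Python sketch on a toy citation graph. It is an illustration under stated assumptions, not the paper's method: the function names, the toy data, the damping factor of 0.85, and the fixed iteration count are all hypothetical choices, and the last function is plain textbook PageRank run on citations, since the abstract does not spell out how PaperRank varies from it.

```python
def citation_counts(cites):
    """Count incoming citations.

    `cites` maps each paper id to the list of paper ids it cites
    (assumed to contain no duplicate entries).
    """
    counts = {paper: 0 for paper in cites}
    for targets in cites.values():
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def h_index(paper_citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(paper_citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h


def pagerank(cites, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over the citation graph.

    Assumption: this is standard PageRank, not the paper's PaperRank variant.
    Papers that cite nothing spread their score uniformly over all papers.
    """
    nodes = set(cites)
    for targets in cites.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {paper: 1.0 / n for paper in nodes}
    for _ in range(iterations):
        new_rank = {paper: (1.0 - damping) / n for paper in nodes}
        dangling = 0.0
        for paper in nodes:
            targets = cites.get(paper, [])
            if targets:
                share = damping * rank[paper] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                dangling += damping * rank[paper]
        for paper in nodes:
            new_rank[paper] += dangling / n
        rank = new_rank
    return rank


# Hypothetical toy graph: D cites A and C, C cites A and B, B cites A.
toy_graph = {"A": [], "B": ["A"], "C": ["A", "B"], "D": ["A", "C"]}
toy_counts = citation_counts(toy_graph)
toy_ranks = pagerank(toy_graph)
```

In this toy graph, papers B and C each receive one citation, yet their PageRank scores differ because B is cited by a paper that is itself cited, while C is cited by an uncited paper. This is the kind of divergence between indexes that the abstract refers to.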
