Background
Optimal ranking of literature importance is vital in overcoming article overload. Existing ranking methods are typically based on raw citation counts, which give a sum of 'inbound' links with no consideration of the importance of each citing article. PageRank, an algorithm originally developed by Google for ranking webpages in its search engine, could potentially be adapted to bibliometrics to quantify the relative importance of articles in a citation network. This article seeks to validate such an approach on the freely available PubMed Central open access subset (PMC-OAS) of biomedical literature.

Results
On-demand cloud computing infrastructure was used to extract a citation network from over 600,000 full-text PMC-OAS articles. PageRank and citation count were calculated for each node in this network. PageRank is highly correlated with citation count (R = 0.905, P < 0.01), and we thus validate the former as a surrogate of literature importance. Furthermore, the algorithm can be run in trivial time on cheap, commodity cluster hardware, lowering the barrier to entry for resource-limited open access organisations.

Conclusions
PageRank can be trivially computed on commodity cluster hardware and is linearly correlated with citation count. Given its putative benefits in quantifying relative importance, we suggest it may enrich the citation network, thereby overcoming the inadequacy of citation counts alone. We thus suggest PageRank as a feasible supplement to, or replacement for, existing bibliometric ranking methods.
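The comparison described above can be illustrated on a toy citation network. The following is a minimal sketch, not the pipeline used in the study: the five-article graph, the function names, the damping factor of 0.85 and the convergence tolerance are all assumptions chosen for illustration, and the computation runs on a single machine rather than on cluster hardware. It computes PageRank by power iteration, counts inbound citations, and reports the Pearson correlation between the two scores.

```python
from math import sqrt


def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank; links maps each article to the articles it cites."""
    nodes = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by articles that cite nothing is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


def citation_counts(links):
    """Raw inbound-citation count for every article in the network."""
    counts = {v: 0 for v in set(links) | {t for ts in links.values() for t in ts}}
    for targets in links.values():
        for t in targets:
            counts[t] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy network: "A" cites "B" and "C", and so on.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": [], "D": ["C", "B"], "E": ["D"]}

ranks = pagerank(toy_links)
counts = citation_counts(toy_links)
articles = sorted(ranks)
r = pearson([counts[a] for a in articles], [ranks[a] for a in articles])

for a in articles:
    print(a, counts[a], round(ranks[a], 4))
print("Pearson R:", round(r, 3))
```

Unlike the raw count, the PageRank score of an article depends on the scores of the articles citing it, which is the "relative importance" property the abstract refers to. On a network this small the two measures mostly agree, so the sketch shows the mechanics only and says nothing about the PMC-OAS result.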