On the relationship between download and citation counts: An introduction of Granger-causality inference

Abstract Studying the relationship between the citation and download counts of scientific publications helps clarify the mechanisms behind citation patterns and informs research evaluation. However, few studies have considered the directionality between downloads and citations, or have chosen the time lag between the download and citation time series separately for each individual publication. In this paper, we introduce Granger-causal inference to study the directionality between downloads and citations, setting the lag length between the two time series case by case. Examining publications in The Lancet, we find that publications show various directionality patterns, but that highly cited publications are more likely to exhibit Granger causality. We introduce the Granger-causal inference method to information science step by step, in four steps: conducting stationarity tests, determining the time lag between the time series, conducting a cointegration test, and performing Granger-causality inference. We hope that future information scientists can apply this method in their own research contexts.
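As a rough illustration of the final step only (Granger-causality inference), the sketch below compares a restricted autoregression of one series on its own lags with an unrestricted one that also includes lags of the other series, and reports the F statistic for the added lags. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function name `granger_f_stat`, the synthetic "downloads" and "citations" series, and the fixed lag of 1 are all hypothetical, and the stationarity, lag-selection, and cointegration steps are skipped because the synthetic series are stationary by construction.

```python
import numpy as np


def granger_f_stat(target, driver, lag):
    """F statistic for H0: lags of `driver` do not help predict `target`.

    Hypothetical helper for illustration; both series are assumed stationary.
    """
    target = np.asarray(target, dtype=float)
    driver = np.asarray(driver, dtype=float)
    n = len(target)
    y = target[lag:]

    # Column k-1 holds the series shifted back by k periods, aligned with y.
    own_lags = np.column_stack([target[lag - k:n - k] for k in range(1, lag + 1)])
    drv_lags = np.column_stack([driver[lag - k:n - k] for k in range(1, lag + 1)])
    const = np.ones((n - lag, 1))

    restricted = np.hstack([const, own_lags])         # own history only
    unrestricted = np.hstack([restricted, drv_lags])  # plus the other series

    def rss(design):
        # Residual sum of squares of an ordinary least-squares fit.
        coef = np.linalg.lstsq(design, y, rcond=None)[0]
        resid = y - design @ coef
        return float(resid @ resid)

    rss_r, rss_u = rss(restricted), rss(unrestricted)
    dof = len(y) - unrestricted.shape[1]
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic data (an assumption, not from the paper): citations follow
# downloads with a one-period delay, so the dependence runs one way only.
rng = np.random.default_rng(0)
downloads = rng.normal(size=200)
citations = np.zeros(200)
citations[1:] = 0.8 * downloads[:-1] + 0.1 * rng.normal(size=199)

f_down_to_cit = granger_f_stat(citations, downloads, lag=1)
f_cit_to_down = granger_f_stat(downloads, citations, lag=1)
print(f"downloads -> citations: F = {f_down_to_cit:.2f}")
print(f"citations -> downloads: F = {f_cit_to_down:.2f}")
```

On data built this way, the F statistic for "downloads Granger-cause citations" should be far larger than the one for the reverse direction. In practice each statistic would be compared against an F distribution with `lag` and `dof` degrees of freedom, and the lag would be chosen per publication rather than fixed.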
