Field- and time-normalization of data with many zeros: an empirical analysis using citation and Twitter data

Thelwall (J Informetr 11(1):128–151, 2017a. https://doi.org/10.1016/j.joi.2016.12.002; Web indicators for research evaluation: a practical guide. Morgan and Claypool, London, 2017b) proposed a new family of field- and time-normalized indicators intended for sparse data. These indicators are based on units of analysis (e.g., institutions) rather than on individual papers. They compare the proportion of a unit's papers that are mentioned (e.g., on Twitter) with the proportion of mentioned papers in the corresponding fields and publication years. We propose a new indicator for this family, the Mantel–Haenszel quotient (MHq). The MHq is rooted in Mantel–Haenszel (MH) analysis, an established method for pooling data from several 2 × 2 contingency tables based on different subgroups. Using citations and assessments by peers, we investigate whether the indicator family can distinguish between the quality levels defined by peer assessments; in other words, we test the indicators' convergent validity. We find that the MHq distinguishes between quality levels in most cases, whereas the other indicators of the family do not. Since our study establishes the MHq as a convergently valid indicator, we apply it to four different Twitter groups as defined by the company Altmetric. Our results show a weak relationship between the Twitter counts of all four Twitter groups and scientific quality, much weaker than the relationship between citations and scientific quality. Therefore, our results discourage the use of Twitter counts in research evaluation.
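
The MHq builds on the classical Mantel–Haenszel estimator for pooling 2 × 2 tables across subgroups. As a minimal sketch of that pooling idea, the Python snippet below assumes one 2 × 2 table per field and publication-year cell, contrasting a unit's mentioned and unmentioned papers with those of a reference ("world") set. The Stratum type, the function name, and the example counts are illustrative assumptions, and the pooled value is the standard MH risk-ratio form, which may differ in detail from the exact MHq defined in the paper.

# Illustrative sketch of Mantel-Haenszel pooling of 2 x 2 tables; not
# necessarily the exact MHq formula from the paper.
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Stratum:
    # One hypothetical 2 x 2 table for a field/publication-year cell.
    unit_mentioned: int   # unit's papers mentioned (e.g., tweeted) in this cell
    unit_total: int       # all papers of the unit in this cell
    world_mentioned: int  # mentioned papers in the reference set
    world_total: int      # all papers in the reference set

def mantel_haenszel_quotient(strata: Iterable[Stratum]) -> float:
    """Pool the 2 x 2 tables with the standard Mantel-Haenszel risk-ratio
    estimator; values above 1 mean the unit's papers are mentioned more
    often than expected from the corresponding fields and years."""
    num = den = 0.0
    for s in strata:
        n = s.unit_total + s.world_total  # all papers in this stratum
        if n == 0:
            continue
        num += s.unit_mentioned * s.world_total / n
        den += s.world_mentioned * s.unit_total / n
    return num / den if den > 0 else float("nan")

# Hypothetical example: two field/year cells with sparse mention data.
strata = [
    Stratum(unit_mentioned=3, unit_total=40, world_mentioned=50, world_total=1000),
    Stratum(unit_mentioned=1, unit_total=25, world_mentioned=10, world_total=800),
]
print(mantel_haenszel_quotient(strata))

With these illustrative counts the pooled quotient is roughly 1.73: the unit's papers are mentioned more often than expected given their fields and publication years, while a value near 1 would indicate an average mention rate. Because each stratum contributes weighted counts rather than a separately estimated ratio, the estimator remains stable even when many cells contain zero mentions, which is exactly the sparse-data situation the indicator family targets.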

[1] J. Ruiz-Castillo, et al. “Context Counts: Pathways to Master Big and Little Data”, 2014.

[2] Vincent Larivière, et al. Tweets vs. Mendeley readers: How do these two social media metrics differ?, 2014, it Inf. Technol.

[3] David J. Sheskin, et al. Handbook of Parametric and Nonparametric Statistical Procedures, 1997.

[4] J. Fleiss, Statistical methods for rates and proportions, 1974.

[5] J. A. Nelder, et al. The Analysis of Categorical Data, 1975.

[6] Tara J. Brigham, An Introduction to Altmetrics, 2014, Medical Reference Services Quarterly.

[7] Sam Work, Social Media in Scholarly Communication: A Review of the Literature and Empirical Analysis of Twitter Use by SSHRC Doctoral Award Recipients, 2015.

[8] Yin Leng Theng, et al. Altmetrics: an analysis of the state-of-the-art in measuring research impact on social media, 2016, Scientometrics.

[9] Lutz Bornmann, et al. Proposal of a minimum constraint for indicators based on means or averages, 2016, J. Informetrics.

[10] Nadine Rons, et al. Partition-based Field Normalization: An approach to highly specialized publication records, 2013, J. Informetrics.

[11] Lutz Bornmann, et al. Measuring field-normalized impact of papers on specific societal groups: An altmetrics study based on Mendeley data, 2016, ArXiv.

[12] Andrea Bergmann, et al. Citation Indexing: Its Theory and Application in Science, Technology, and Humanities, 2016.

[13] B. Bailey, et al. Confidence limits to the risk ratio, 1987, Biometrics.

[14] M. Eysenck, et al. The correlation between RAE ratings and citation counts in psychology (technical report), 2002.

[15] Philip H. Ramsey, Nonparametric Statistical Methods, 1974, Technometrics.

[16] Massimo Franceschet, et al. The first Italian research assessment exercise: A bibliometric perspective, 2009, J. Informetrics.

[17] Lutz Bornmann, et al. Sampling issues in bibliometric analysis: Response to discussants, 2016, J. Informetrics.

[18] Nadine Rons, et al. Investigation of Partition Cells as a Structural Basis Suitable for Assessments of Individual Scientists, 2014, ArXiv.

[19] Lutz Bornmann, et al. Interrater reliability and convergent validity of F1000Prime peer review, 2014, J. Assoc. Inf. Sci. Technol.

[20] Lutz Bornmann, et al. Scientific peer review, 2011, Annu. Rev. Inf. Sci. Technol.

[21] Mike Thelwall, et al. National research impact indicators from Mendeley readers, 2015, J. Informetrics.

[22] Gabriel Kreiman, et al. Nine Criteria for a Measure of Scientific Output, 2011, Front. Comput. Neurosci.

[23] Michael Thelwall, et al. Web Indicators for Research Evaluation: A Practical Guide, 2016, Synthesis Lectures on Information Concepts, Retrieval, and Services.

[24] J. Fleiss, et al. Statistical methods for rates and proportions, 1973.

[25] Lutz Bornmann, et al. Normalization of Mendeley reader impact on the reader- and paper-side: A comparison of the mean discipline normalized reader score (MDNRS) with the mean normalized reader score (MNRS) and bare reader counts, 2016, J. Informetrics.

[26] François Claveau, There should not be any mystery: A comment on sampling issues in bibliometrics, 2016, J. Informetrics.

[27] Eugene Garfield, Citation indexing: its theory and application in science, technology, and humanities, 1979.

[28] W. Haenszel, et al. Statistical aspects of the analysis of data from retrospective studies of disease, 1959, Journal of the National Cancer Institute.

[29] Stephen McKay, et al. Social Policy Excellence – Peer Review or Metrics? Analyzing the 2008 Research Assessment Exercise in Social Work and Social Policy and Administration, 2012.

[30] Bret Larget, et al. Analysis of Categorical Data, 2002.

[31] Ludo Waltman, et al. F1000 Recommendations as a Potential New Data Source for Research Evaluation: A Comparison With Citations, 2014, J. Assoc. Inf. Sci. Technol.

[32] Wolfgang Glänzel, et al. A bibliometric study on ageing and reception processes of scientific literature, 1995, J. Inf. Sci.

[33] Lutz Bornmann, et al. How to normalize Twitter counts? A first attempt based on journals in the Twitter Index, 2016, Scientometrics.

[34] A. Neely, et al. Citation Counts: Are They Good Predictors of RAE Scores? A Bibliometric Analysis of RAE 2001, 2008.

[35] Lutz Bornmann, et al. Validity of altmetrics data for measuring societal impact: A study using data from Altmetric and F1000Prime, 2014, J. Informetrics.

[36] Andreas Diekmann, et al. Die Rezeption (Thyssen-)preisgekrönter Artikel in der „Scientific Community“ [The reception of (Thyssen) prize-winning articles in the scientific community], 2012.

[37] Lutz Bornmann, et al. Normalization of zero-inflated data: An empirical analysis of a new indicator family, 2017, J. Informetrics.

[38] V. Rich, Personal communication, 1989, Nature.

[39] L. Butler, et al. Evaluating University Research Performance Using Metrics, 2011.

[40] Stefanie Haustein, et al. Grand challenges in altmetrics: heterogeneity, data quality and dependencies, 2016, Scientometrics.

[41] Mario Cortina-Borja, et al. Handbook of Parametric and Nonparametric Statistical Procedures, 5th edn, 2012.

[42] Lutz Bornmann, et al. Sampling issues in bibliometric analysis, 2014, J. Informetrics.

[43] Mike Thelwall, et al. Three practical field normalised alternative indicator formulae for research evaluation, 2016, J. Informetrics.

[44] S. Radhakrishna, Combination of results from several 2 × 2 contingency tables, 1965.

[45] Lutz Bornmann, et al. Normalization of Mendeley reader counts for impact assessment, 2016, J. Informetrics.

[46] Anne E. Rauh, et al. Introduction to Altmetrics for Science, Technology, Engineering, and Mathematics (STEM) Librarians, 2013.

[47] Thed N. van Leeuwen, et al. Towards a new crown indicator: an empirical analysis, 2010, Scientometrics.

[48] Mike Thelwall, et al. Alternative metric indicators for funding scheme evaluations, 2016, Aslib J. Inf. Manag.
