Predicting the Reproducibility of Social and Behavioral Science Papers Using Supervised Learning Models

In recent years, significant effort has been invested in verifying the reproducibility and robustness of research claims in the social and behavioral sciences (SBS), much of it through resource-intensive replication projects. In this paper, we investigate predicting the reproducibility of SBS papers using machine learning methods based on a set of features. We propose a framework that extracts five types of features from scholarly work to support assessments of the reproducibility of published research claims. Bibliometric, venue, and author features are collected from public APIs or extracted using open-source machine learning libraries with customized parsers. Statistical features, such as p-values, are extracted by recognizing patterns in the body text. Semantic features, such as funding information, are obtained from public APIs or extracted using natural language processing models. We analyze pairwise correlations between individual features and their importance for predicting a set of human-assessed ground-truth labels. In doing so, we identify a subset of 9 top features that play relatively more important roles in predicting the reproducibility of SBS papers in our corpus. We verify these results by comparing the performance of 10 supervised predictive classifiers trained on different sets of features.
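To illustrate the pattern-based extraction of statistical features mentioned above, here is a minimal sketch (not the authors' actual implementation): a regular expression pulls reported p-values out of body text and summarizes them as candidate features. The regex and the summary features (count, minimum, number below 0.05) are illustrative assumptions.

```python
import re
from typing import List

# Matches common ways p-values appear in running text,
# e.g. "p < .05", "p = 0.032", "P<0.001". Illustrative only;
# real papers report p-values in many more formats.
P_VALUE_PATTERN = re.compile(r"\bp\s*[=<>]\s*(0?\.\d+)", re.IGNORECASE)

def extract_p_values(body_text: str) -> List[float]:
    """Return all p-values recognized in the body text."""
    return [float(m.group(1)) for m in P_VALUE_PATTERN.finditer(body_text)]

def p_value_features(body_text: str) -> dict:
    """Summarize extracted p-values as candidate statistical features."""
    values = extract_p_values(body_text)
    return {
        "num_p_values": len(values),
        "min_p_value": min(values) if values else None,
        "num_below_0.05": sum(v < 0.05 for v in values),
    }

text = "The effect was significant (p = .032), with follow-ups at p < 0.001 and p = 0.21."
print(p_value_features(text))
# {'num_p_values': 3, 'min_p_value': 0.001, 'num_below_0.05': 2}
```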
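Similarly, the classifier-comparison step can be made concrete with a minimal sketch, assuming a tabular feature matrix and binary human-assessed reproducibility labels. The specific classifiers, the synthetic data, and the accuracy metric below are assumptions for illustration, not the paper's exact experimental setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Toy stand-in for the real data: rows are papers, columns are
# extracted features (bibliometric, venue, author, statistical,
# semantic); y holds labels (1 = reproducible, 0 = not).
# All values here are synthetic.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))  # e.g. a matrix of 9 top features
y = rng.integers(0, 2, size=200)

classifiers = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm_rbf": SVC(),
}

# Train each classifier on the same feature set and compare
# mean 5-fold cross-validated accuracy.
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```

In practice, the same loop would be repeated over different feature subsets (e.g. all features versus the top-ranked subset) to verify which features drive predictive performance.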
