Extraction and Evaluation of Statistical Information from Social and Behavioral Science Papers

With the number of published papers continuing to grow substantially across the scientific literature, reliable approaches for automated discovery and assessment of published findings are increasingly urgent. Tools that extract critical information from scientific papers and their metadata can support representation of and reasoning over existing findings, and can offer insight into the replicability, robustness, and generalizability of specific claims. In this work, we present a pipeline for extracting statistical information (p-values, sample sizes, and number of hypotheses tested) from full-text scientific documents. We validate our approach on 300 papers selected from the social and behavioral science literature and suggest directions for future work.
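To illustrate the kind of extraction step such a pipeline involves, the following is a minimal sketch of pulling reported p-values out of plain text with a regular expression. It is not the method used in this work; the pattern `P_VALUE_RE` and the function `extract_p_values` are hypothetical names introduced here, and the sketch assumes the PDF has already been converted to text and that p-values are reported in the common "p < .05" or "p = 0.21" style.

```python
import re

# Hypothetical pattern (an assumption, not the paper's): matches reports such as
# "p < .05", "p = 0.003", or "P<=0.01". The leading \b avoids matching a "p"
# at the end of a longer word, and (?!\d) rejects values such as "10".
P_VALUE_RE = re.compile(
    r"\bp\s*(<=|>=|<|>|=)\s*(0?\.\d+|[01](?:\.0+)?)(?!\d)",
    re.IGNORECASE,
)


def extract_p_values(text):
    """Return (comparator, value) pairs for each p-value report found in text."""
    return [(m.group(1), float(m.group(2))) for m in P_VALUE_RE.finditer(text)]


sample = "The effect was significant (p < .05), but not in Study 2, p = 0.21."
print(extract_p_values(sample))
```

A pattern like this only covers in-line reports; p-values in tables, figures, or typeset formulas would need separate handling.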

[18]  Richard Zanibbi,et al.  ScanSSD: Scanning Single Shot Detector for Mathematical Formulas in PDF Document Images , 2020, ArXiv.