Assessing Scientific Research Papers with Knowledge Graphs

In recent decades, the growing scale of scientific research has produced numerous novel findings. Reproducing these findings is the foundation of future research. However, because experiments are complex, manually assessing scientific research is laborious and time-intensive, especially in the social and behavioral sciences. Although reproducibility studies have attracted growing attention in the research community, systematic methods for evaluating scientific research at scale are still lacking. In this paper, we propose a novel approach to automatically assessing scientific publications by constructing a knowledge graph (KG) that captures a holistic view of the research contributions. Specifically, during KG construction, we combine information from two perspectives: micro-level features that capture knowledge from published articles, such as sample sizes, effect sizes, and experimental models, and macro-level features that comprise relationships between entities, such as authorship and reference information. We then learn low-dimensional representations of the entities (the nodes of the KG) using language models and knowledge graph embeddings, and use these representations for the assessments. A comprehensive set of experiments on two benchmark datasets shows the usefulness of leveraging KGs for scoring scientific research.
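
The two feature perspectives described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `PaperRecord` fields, the relation names `authored_by` and `cites`, and the identifiers in the usage example are all hypothetical, and the translational scoring function (in the style of TransE) is assumed here only as one common example of a knowledge graph embedding score. Macro-level information becomes entity-to-entity triples, while micro-level information is kept as literal attributes attached to the paper node.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


@dataclass
class PaperRecord:
    """Hypothetical record for one paper; field names are illustrative."""
    paper_id: str
    authors: List[str]
    references: List[str]
    sample_size: int
    effect_size: float
    experimental_model: str


def build_kg(papers: Sequence[PaperRecord]) -> Tuple[Set[Triple], Dict[str, Dict[str, object]]]:
    """Return macro-level triples and micro-level literals keyed by paper."""
    triples: Set[Triple] = set()
    literals: Dict[str, Dict[str, object]] = {}
    for paper in papers:
        # Macro-level features: relationships between entities.
        for author in paper.authors:
            triples.add((paper.paper_id, "authored_by", author))
        for ref in paper.references:
            triples.add((paper.paper_id, "cites", ref))
        # Micro-level features: knowledge extracted from the article itself.
        literals[paper.paper_id] = {
            "sample_size": paper.sample_size,
            "effect_size": paper.effect_size,
            "experimental_model": paper.experimental_model,
        }
    return triples, literals


def translational_score(h: Sequence[float], r: Sequence[float], t: Sequence[float]) -> float:
    """Assumed TransE-style score: negative L2 distance between h + r and t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Usage with made-up values.
papers = [
    PaperRecord("paper:1", ["author:a"], ["paper:2"], 120, 0.35, "linear regression"),
]
triples, literals = build_kg(papers)
perfect_fit = translational_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
```

In a full pipeline, the embedding vectors would be learned from the triples rather than supplied by hand, and the literals could be encoded separately (for example with a language model) before being combined with the node embeddings.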
