Open Research Knowledge Graph: Next Generation Infrastructure for Semantic Scholarly Knowledge

Despite improved digital access to scholarly knowledge in recent decades, scholarly communication remains exclusively document-based. In this form, scholarly knowledge is hard to process automatically. We present the first steps towards a knowledge-graph-based infrastructure that acquires scholarly knowledge in machine-actionable form, thus enabling new possibilities for scholarly knowledge curation, publication and processing. The primary contribution is to present, evaluate and discuss multi-modal scholarly knowledge acquisition, which combines crowdsourced and automated techniques. We report the results of the first user evaluation of the infrastructure, conducted with participants of a recent international conference. The results suggest that users were intrigued by the novelty of the proposed infrastructure and by the possibilities for innovative scholarly knowledge processing it could enable.
