Knowledge Graph-Empowered Materials Discovery

In this position paper, we describe research on knowledge graph-empowered prediction and discovery in materials science. The research comprises several key components, including ontology mapping, materials data annotation, and information extraction from unstructured scholarly articles. We argue that although big data generated by simulations and experiments has motivated and accelerated data-driven science, the distributed and heterogeneous nature of materials science data hinders major advances in the field. Knowledge graphs, acting as semantic hubs, integrate disparate data and offer a feasible way to address this challenge. We design a knowledge graph-based approach for data discovery, extraction, and integration in materials science.
