Inorganic synthesis recommendation by machine learning materials similarity from scientific literature

Synthesis prediction is a key accelerator for the rapid design of advanced materials. However, determining synthesis variables such as the choice of precursor materials, operations, and conditions is challenging for inorganic materials because the sequence of reactions during heating is not well understood. In this work, we use a knowledge base of 29,900 solid-state synthesis recipes, text-mined from the scientific literature, to automatically learn which precursors to recommend for the synthesis of a novel target material. The data-driven approach learns chemical similarity of materials and refers the synthesis of a new target to precedent synthesis procedures of similar materials, mimicking human synthesis design. When proposing five precursor sets for each of 2,654 unseen test target materials, the recommendation strategy achieves a success rate of at least 82%. Our approach captures decades of heuristic synthesis data in a mathematical form, making it accessible for use in recommendation engines and autonomous laboratories.
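The recommendation idea described above (find known materials similar to the target, then reuse their precedent precursor sets) can be illustrated with a minimal sketch. This is not the authors' method: the paper learns its similarity measure from text-mined recipes, whereas the sketch assumes a plain cosine similarity over element fractions. The function names, the element-coverage filter, and the small `TOY_KB` of textbook-style recipes are all hypothetical and are not drawn from the 29,900-recipe knowledge base.

```python
import math
import re
from collections import Counter


def composition_vector(formula):
    """Parse a flat formula such as 'LiFePO4' into element fractions.

    Assumption: no parentheses or hydrates; this is only a toy parser.
    """
    counts = Counter()
    for element, amount in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        counts[element] += float(amount) if amount else 1.0
    total = sum(counts.values())
    return {el: n / total for el, n in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse element-fraction vectors."""
    dot = sum(value * b.get(el, 0.0) for el, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def covers_target(target_vec, precursors):
    """True if the precursors together contain every element of the target."""
    supplied = set()
    for precursor in precursors:
        supplied |= set(composition_vector(precursor))
    return set(target_vec) <= supplied


def recommend_precursors(target, knowledge_base, k=5):
    """Propose up to k distinct precursor sets, most similar precedent first."""
    target_vec = composition_vector(target)
    ranked = sorted(
        knowledge_base,
        key=lambda recipe: cosine(target_vec, composition_vector(recipe[0])),
        reverse=True,
    )
    proposals = []
    for _, precursors in ranked:
        candidate = frozenset(precursors)
        if candidate in proposals or not covers_target(target_vec, precursors):
            continue
        proposals.append(candidate)
        if len(proposals) == k:
            break
    return proposals


# Hypothetical toy knowledge base: (known target, precursor list).
TOY_KB = [
    ("LiFePO4", ["Li2CO3", "FeC2O4", "NH4H2PO4"]),
    ("LiFePO4", ["Li2CO3", "FePO4"]),
    ("LiMnPO4", ["Li2CO3", "MnCO3", "NH4H2PO4"]),
    ("LiCoO2", ["Li2CO3", "Co3O4"]),
    ("BaTiO3", ["BaCO3", "TiO2"]),
]

if __name__ == "__main__":
    # Li3Fe2P3O12 is written as a flat formula for Li3Fe2(PO4)3.
    for proposal in recommend_precursors("Li3Fe2P3O12", TOY_KB, k=5):
        print(sorted(proposal))
```

The coverage filter is a deliberate simplification: it discards precedent recipes that cannot supply every element of the target, instead of adapting them. A learned similarity would replace `cosine` and `composition_vector` without changing the ranking loop.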
