Rapid prediction of NMR spectral properties with quantified uncertainty

Accurate calculation of NMR spectral properties is an important step in molecular structure elucidation. Here we report a novel machine learning technique for predicting chemical shifts of both $^{1}\mathrm{H}$ and $^{13}\mathrm{C}$ nuclei, which exceeds DFT-accessible accuracy for $^{13}\mathrm{C}$ and $^{1}\mathrm{H}$ for a subset of nuclei while running orders of magnitude faster. Our method also produces estimates of uncertainty, allowing for robust and confident predictions, and suggests avenues for future improvement.
