Deep imitation learning for molecular inverse problems

Many measurement modalities arise from well-understood physical processes and produce data that is information-rich but difficult to interpret. Much of this data still requires laborious human interpretation. This is the case in nuclear magnetic resonance (NMR) spectroscopy, where the observed spectrum of a molecule provides a distinguishing fingerprint of its bond structure. Here we solve the resulting inverse problem: given a molecular formula and a spectrum, can we infer the chemical structure? We show that for a wide variety of molecules we can quickly compute the correct molecular structure, and can detect with reasonable certainty when our method cannot. We treat this as a problem of graph-structured prediction: armed with per-vertex information on a subset of the vertices, we infer the edges and edge types. We frame the problem as a Markov decision process (MDP) and incrementally construct molecules one bond at a time, training a deep neural network via imitation learning to imitate a subisomorphic oracle that knows which remaining bonds are correct. Our method is fast and accurate, and is the first among recent chemical-graph generation approaches to exploit per-vertex information and generate graphs with vertex constraints. It points the way towards automation of molecular structure identification and potentially active learning for spectroscopy.
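The bond-at-a-time construction can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a fixed atom labeling, so the oracle reduces to a set difference rather than the subisomorphism check described above, and it omits the spectrum, the per-vertex features, and the neural network. The names `Bond`, `oracle_remaining`, and `rollout` are hypothetical. The sketch only shows how an oracle that knows the remaining correct bonds could yield (state, target) pairs for imitation learning.

```python
import random
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical encoding: (atom_i, atom_j, bond_order) with atom_i < atom_j.
Bond = Tuple[int, int, int]
Example = Tuple[FrozenSet[Bond], Dict[Bond, float]]


def oracle_remaining(target: FrozenSet[Bond], placed: FrozenSet[Bond]) -> List[Bond]:
    """Simplified oracle: the target bonds that have not been placed yet.

    Assumption: atom labels are fixed, so no isomorphism test is needed.
    """
    return sorted(target - placed)


def rollout(target: FrozenSet[Bond], rng: random.Random) -> Tuple[FrozenSet[Bond], List[Example]]:
    """Build a molecule one bond at a time, following the oracle.

    Each step records the current partial graph (the MDP state) together with
    a uniform distribution over the correct next bonds (the imitation target).
    """
    placed: FrozenSet[Bond] = frozenset()
    examples: List[Example] = []
    while True:
        remaining = oracle_remaining(target, placed)
        if not remaining:
            break
        probs = {bond: 1.0 / len(remaining) for bond in remaining}
        examples.append((placed, probs))
        # Action: add one correct bond, chosen at random among the valid ones.
        placed = placed | {rng.choice(remaining)}
    return placed, examples


# Heavy-atom skeleton of acetaldehyde: C0-C1 single bond, C1=O2 double bond.
target = frozenset({(0, 1, 1), (1, 2, 2)})
final, examples = rollout(target, random.Random(0))
```

In a full system, a network would be trained so that its predicted distribution over candidate bonds matches the recorded targets, given the partial graph and the per-vertex spectral information.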
