Decoupled coordinates for machine learning-based molecular fragment linking

Recent developments in machine-learning-based molecular fragment linking have demonstrated the importance of informing the generation process with structural information that specifies the relative orientation of the fragments to be linked. However, such structural information has not yet been provided in the form of a complete relative coordinate system. Here, the mathematical details of a decoupled set of bond lengths, bond angles, and torsion angles are elaborated, and the resulting coordinate system is shown to be complete. Numerical experiments demonstrate a significant impact on the quality of the generated linkers. The amount of reliable information carried by each type of degree of freedom is investigated, supported by ablation studies and an information-theoretical analysis. These benefits suggest adopting a complete and decoupled relative coordinate system as standard good practice in linker design.
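To make the idea of a relative coordinate system built from bond lengths, bond angles, and torsion angles concrete, the following minimal sketch describes the relative placement of two rigid fragments with one pseudo-bond length, two angles, and three torsions (six values, matching the six rigid-body degrees of freedom). It is an illustration under stated assumptions, not the construction used in this work: the choice of three anchor atoms per fragment and the names `relative_coordinates`, `frag_a`, and `frag_b` are hypothetical.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def bond_length(p, q):
    """Distance between points p and q."""
    return norm(sub(q, p))

def bond_angle(p, q, r):
    """Angle p-q-r at the central point q, in radians."""
    u, v = sub(p, q), sub(r, q)
    c = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against round-off

def torsion(p, q, r, s):
    """Signed torsion p-q-r-s about the q-r axis, in (-pi, pi]."""
    b1, b2, b3 = sub(q, p), sub(r, q), sub(s, r)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    b2_hat = tuple(x / norm(b2) for x in b2)
    return math.atan2(dot(cross(n1, n2), b2_hat), dot(n1, n2))

def relative_coordinates(frag_a, frag_b):
    """Six internal coordinates relating two rigid fragments.

    ASSUMPTION: each fragment is represented by three non-collinear
    anchor points (a1, a2, a3) and (b1, b2, b3); a1 and b1 are the
    points joined by the pseudo-bond. This is a generic illustration,
    not the coordinate definition of the paper.
    """
    a1, a2, a3 = frag_a
    b1, b2, b3 = frag_b
    return {
        "bond": bond_length(a1, b1),
        "angles": (bond_angle(a2, a1, b1), bond_angle(a1, b1, b2)),
        "torsions": (torsion(a3, a2, a1, b1),
                     torsion(a2, a1, b1, b2),
                     torsion(a1, b1, b2, b3)),
    }

# Example anchors (arbitrary coordinates chosen for illustration).
frag_a = ((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0))
frag_b = ((0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (1.0, 1.0, 2.0))
coords = relative_coordinates(frag_a, frag_b)
```

Because every value is computed from distances and angles between points, the six numbers are unchanged when both fragments are translated or rotated together, which is the property that makes such coordinates relative.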
