Graph Neural Network Based Coarse-Grained Mapping Prediction

The selection of coarse-grained (CG) mapping operators is a critical step in CG molecular dynamics (MD) simulation. What makes a mapping optimal remains an open question, and theory to guide the choice is lacking. Currently, the state-of-the-art approach is for experts to select mapping operators manually. In this work, we demonstrate an automated approach by casting the problem as supervised learning, in which we seek to reproduce the mapping operators produced by experts. We present a graph neural network-based CG mapping predictor called the Deep Supervised Graph Partitioning Model (DSGPM), which treats the prediction of mapping operators as a graph segmentation problem. DSGPM is trained on a novel dataset, Human-annotated Mappings (HAM), consisting of 1180 molecules with expert-annotated mapping operators. HAM can be used to facilitate further research in this area. Our model uses a novel metric learning objective to produce high-quality atomic features, which are then used in spectral clustering. The results show that DSGPM outperforms state-of-the-art graph segmentation methods. Finally, we find that the predicted CG mapping operators indeed yield good CG MD models when used in simulation.
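The abstract does not state the form of the metric learning objective. As an illustration of the general idea only, the sketch below uses a generic pairwise contrastive loss, which is not necessarily DSGPM's actual objective: atoms that an expert assigned to the same CG bead are pulled together in feature space, and atoms in different beads are pushed at least a margin apart. The function name `pairwise_contrastive_loss`, the `margin` value, and the toy features are all assumptions made for this example.

```python
import math
from itertools import combinations


def pairwise_contrastive_loss(features, bead_ids, margin=1.0):
    """Mean contrastive loss over all atom pairs (illustrative, hypothetical helper).

    features: one feature vector (sequence of floats) per atom.
    bead_ids: expert-assigned CG bead index per atom.
    Same-bead pairs contribute d^2; different-bead pairs contribute
    max(0, margin - d)^2, where d is the Euclidean feature distance.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(features)), 2):
        d = math.dist(features[i], features[j])
        if bead_ids[i] == bead_ids[j]:
            total += d ** 2                      # pull same-bead atoms together
        else:
            total += max(0.0, margin - d) ** 2   # push different beads apart
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy molecule: four atoms, two beads of two atoms each (assumed data).
bead_ids = [0, 0, 1, 1]
well_separated = [(0.0, 0.0), (0.0, 0.0), (2.0, 0.0), (2.0, 0.0)]
mixed_up = [(0.0, 0.0), (2.0, 0.0), (0.0, 0.0), (2.0, 0.0)]

loss_good = pairwise_contrastive_loss(well_separated, bead_ids)
loss_bad = pairwise_contrastive_loss(mixed_up, bead_ids)
print(loss_good, loss_bad)
```

Features that already cluster by bead give a lower loss than features that mix beads. Features trained under an objective of this kind are what make a downstream clustering step, such as the spectral clustering mentioned above, likely to recover the expert grouping.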
