Visual Analytics for Deep Embeddings of Large Scale Molecular Dynamics Simulations

Molecular dynamics (MD) simulation has emerged as an excellent candidate for understanding the complex atomic- and molecular-scale mechanisms of biomolecules that control essential biophysical phenomena in living organisms. However, MD produces large, long-timescale data that are inherently high-dimensional and can occupy many terabytes. Processing this immense amount of data in a meaningful way is becoming increasingly difficult. A dimensionality reduction algorithm based on deep learning is therefore employed here to embed the high-dimensional data in a lower-dimensional latent space that still preserves the inherent molecular characteristics, i.e., retains biologically meaningful information. The results of the embedding models are then visualized for model evaluation and for analysis of the extracted underlying features. However, most existing visualizations for embeddings are limited in their ability to evaluate embedding models and to support understanding of complex simulation data. We propose an interactive visual analytics system for embeddings of MD simulations that not only evaluates and explains an embedding model but also supports analysis of various characteristics of the simulations. Our system enables exploration and discovery of meaningful, semantic embedding results and supports understanding and evaluation of those results through quantitatively described features of the MD simulations, even without specific labels.
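As a rough illustration of evaluating an embedding against a quantitatively described feature without labels, the sketch below correlates a one-dimensional latent coordinate with the radius of gyration of each frame. Everything here is an assumption made for illustration: the abstract does not name the features or the model used, the trajectory is synthetic, and the `latent` values are a hypothetical stand-in for the output of an embedding model.

```python
import math
import random


def radius_of_gyration(coords):
    """RMS distance of atoms from their centroid (equal masses assumed)."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    sq = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords)
    return math.sqrt(sq / n)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Synthetic stand-in for a trajectory: one base structure that expands
# uniformly from frame to frame (not real MD data).
random.seed(0)
base = [(random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1))
        for _ in range(20)]
scales = [1.0 + 0.05 * i for i in range(10)]
frames = [[(s * x, s * y, s * z) for x, y, z in base] for s in scales]

# Hypothetical 1-D latent coordinate per frame. A real workflow would take
# this from the trained embedding model; here it is simply the scale factor.
latent = scales

rg = [radius_of_gyration(frame) for frame in frames]
r = pearson(latent, rg)
print(f"correlation between latent coordinate and radius of gyration: {r:.3f}")
```

A correlation near 1 or -1 would suggest that the latent coordinate tracks the chosen physical feature; in this toy case the relationship is linear by construction, so the example shows only the mechanics of the comparison.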
