A framework for uncertainty-aware visual analytics of proteins

Abstract: Because existing experimental methods for capturing stereochemical molecular data are limited, models describing the conformation of macromolecules usually carry an inherent level of uncertainty. This uncertainty can originate from various sources and can significantly affect algorithms and decisions based on such models. Incorporating uncertainty into state-of-the-art visualization approaches for molecular data is therefore important, so that scientists analyzing the data are aware of the uncertainty inherent in its representation. In this work, we introduce a framework that allows biochemists to explore molecular data in a familiar environment while including uncertainty information in the visualizations. The framework is based on an anisotropic description of proteins that can be propagated through the required computations. It provides multiple views that extend prominent visualization approaches to visually encode the uncertainty of atom positions and supports interactive exploration. We show the effectiveness of our approach by applying it to multiple real-world datasets and gathering user feedback.
