CoeViz: A Web-Based Integrative Platform for Interactive Visualization of Large Similarity and Distance Matrices

Similarity and distance matrices are general data structures that describe reciprocal relationships between the objects within a given dataset. Commonly used methods for representing these matrices include heatmaps, hierarchical trees, dimensionality reduction, and various types of networks. However, despite a well-developed foundation for visualizing such representations, the challenge of creating an interactive view that allows quick data navigation and interpretation remains largely unaddressed. This problem becomes especially evident for large matrices with hundreds or thousands of objects. In this work, we present a web-based platform for the interactive analysis of large (dis-)similarity matrices. It consists of four major interconnected and synchronized components: a zoomable heatmap, an interactive hierarchical tree, a scalable circular relationship diagram, and a 3D multi-dimensional scaling (MDS) scatterplot. We demonstrate the use of the platform for the analysis of amino acid covariance data in proteins as part of our previously developed CoeViz tool. The web platform enables quick and focused analysis of protein features, such as structural domains and functional sites.
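
As a rough illustration of how a hierarchical tree can be derived from a (dis-)similarity matrix, the sketch below converts a small similarity matrix to distances and runs single-linkage agglomerative clustering. It is not the CoeViz implementation: the conversion d = 1 - s, the choice of single linkage, the function names, and the 4x4 example matrix are all assumptions made for illustration.

```python
from itertools import combinations


def similarity_to_distance(sim):
    """Convert similarities in [0, 1] to distances (assumed rule: d = 1 - s)."""
    return [[1.0 - s for s in row] for row in sim]


def single_linkage(dist):
    """Single-linkage agglomerative clustering on a symmetric distance matrix.

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a hierarchical tree view would draw.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters whose closest members are nearest.
        best = None
        for a, b in combinations(clusters, 2):
            d = min(dist[i][j] for i in a for j in b)
            if best is None or d < best[2]:
                best = (a, b, d)
        a, b, d = best
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
        merges.append((a, b, d))
    return merges


# Hypothetical similarity matrix for four objects (illustrative values only).
similarity = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]

merges = single_linkage(similarity_to_distance(similarity))
for members_a, members_b, d in merges:
    print(members_a, members_b, round(d, 3))
```

In this toy input, objects 0 and 1 merge first, then 2 and 3, and the two pairs join last. The pairwise search is cubic or worse in the number of objects, so for matrices with thousands of objects a dedicated library routine would be the practical choice.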
