Improved data visualization techniques for analyzing macromolecule structural changes

The empirical phase diagram (EPD) is a colored representation of the overall structural integrity and conformational stability of macromolecules in response to various environmental perturbations. Numerous proteins and macromolecular complexes have been analyzed with EPDs to summarize results from large data sets acquired with multiple biophysical techniques. The current EPD method suffers from a number of deficiencies, including the lack of a meaningful relationship between color and actual molecular features, difficulty in identifying contributions from individual techniques, and limited interpretability for color‐blind individuals. In this work, three improved data visualization approaches are proposed as techniques complementary to the EPD. The secondary, tertiary, and quaternary structural changes of multiple proteins as a function of environmental stress were first measured using circular dichroism, intrinsic fluorescence spectroscopy, and static light scattering, respectively. Data sets were then visualized as (1) RGB colors using three‐index EPDs, (2) equiangular polygons using radar charts, and (3) human facial features using Chernoff face diagrams. Data as a function of temperature and pH for bovine serum albumin, aldolase, and chymotrypsin, as well as for candidate protein vaccine antigens including a serine/threonine kinase protein (SP1732) and surface antigen A (SP1650) from S. pneumoniae and hemagglutinin from an H1N1 influenza virus, are used to illustrate the advantages and disadvantages of each type of data visualization technique.
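The idea behind a three‐index color mapping can be illustrated with a minimal sketch. It assumes that each of the three structural indices (secondary, tertiary, and quaternary) is min-max scaled to the range 0 to 1 and then assigned to one RGB channel. The function names, the channel assignment, and the scaling choice are illustrative assumptions, not the procedure used in this work.

```python
def min_max_normalize(values):
    """Scale a sequence of numbers linearly onto [0, 1].

    A constant sequence maps to all zeros (assumed convention).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def three_index_colors(secondary, tertiary, quaternary):
    """Map three per-condition structural indices to 8-bit RGB triples.

    Assumed channel assignment (hypothetical, for illustration only):
      red   <- secondary-structure index (e.g., a circular dichroism signal)
      green <- tertiary-structure index (e.g., an intrinsic fluorescence signal)
      blue  <- quaternary-structure index (e.g., static light scattering)
    """
    if not (len(secondary) == len(tertiary) == len(quaternary)):
        raise ValueError("all three index sequences must have the same length")
    red = min_max_normalize(secondary)
    green = min_max_normalize(tertiary)
    blue = min_max_normalize(quaternary)
    return [
        (round(255 * r), round(255 * g), round(255 * b))
        for r, g, b in zip(red, green, blue)
    ]


if __name__ == "__main__":
    # Made-up values for three conditions; not measured data.
    colors = three_index_colors([0, 5, 10], [2, 2, 4], [1, 3, 5])
    for condition, rgb in enumerate(colors):
        print(condition, rgb)
```

With a mapping of this kind, each color channel can be read back as the contribution of a single technique, which is the property the single-color EPD lacks.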
