Application of New Multiresolution Methods for the Comparison of Biomolecular Electrostatic Properties in the Absence of Global Structural Similarity

In this paper we present a method for the multi-resolution comparison of biomolecular electrostatic potentials that does not require global structural alignment of the biomolecules. The underlying computational geometry algorithm uses multi-resolution attributed contour trees (MACTs) to compare the topological features of volumetric scalar fields. We apply the MACTs to compute electrostatic similarity metrics for a large set of protein chains with varying degrees of sequence, structure, and function similarity. For calibration, we also compute similarity metrics for these chains by a more traditional approach based upon 3D structural alignment and analysis of Carbo similarity indices. The MACT method discriminates between protein chains at a level comparable to the Carbo similarity index method; i.e., it accurately clusters proteins into functionally relevant groups that show a strong dependence on ligand binding sites. Moreover, because the MACT approach does not rely upon pairwise structural alignment, its accuracy and efficiency promise to carry over to future large-scale classification efforts across groups of structurally diverse proteins. The results of the analyses are available from the linked web databases http://ccvweb.cres.utexas.edu/MolSignature/ and http://agave.wustl.edu/similarity/. The MACT analysis tools are available as part of the public domain library of the Topological Analysis and Quantitative Tools (TAQT) from the Center of Computational Visualization at the University of Texas at Austin (http://ccvweb.csres.utexas.edu/software). The Carbo software is available for download with the open-source APBS software package at http://apbs.sf.net/.
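
For readers unfamiliar with the calibration metric, the following is a minimal sketch of a Carbo similarity index for two potentials sampled at the same grid points, using the standard normalized-overlap form S = sum(a*b) / sqrt(sum(a*a) * sum(b*b)). The function name `carbo_index` and the flat-list grid representation are assumptions made for illustration; this is not the interface of the Carbo software distributed with APBS, and the paper's exact integration region and weighting are not specified in the abstract.

```python
import math
from typing import Sequence


def carbo_index(phi_a: Sequence[float], phi_b: Sequence[float]) -> float:
    """Carbo similarity index of two potentials sampled on the same grid.

    Hypothetical helper: both inputs are flat sequences of potential values
    taken at identical grid points, i.e. the structures are assumed to have
    been superimposed onto a common grid beforehand.
    """
    if len(phi_a) != len(phi_b):
        raise ValueError("potentials must be sampled at the same grid points")

    # Discrete stand-ins for the overlap and self-overlap integrals.
    overlap = sum(a * b for a, b in zip(phi_a, phi_b))
    norm_a = math.sqrt(sum(a * a for a in phi_a))
    norm_b = math.sqrt(sum(b * b for b in phi_b))

    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("index is undefined for an all-zero potential")

    # Ranges from -1 (anti-correlated) to +1 (same shape up to a positive scale).
    return overlap / (norm_a * norm_b)
```

Because the index compares values point by point, both potentials must first be placed on a common grid, which is the structural-alignment dependency that the MACT approach avoids.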
