Three dimensional shape comparison of flexible proteins using the local-diameter descriptor

Background: Techniques for inferring the functions of proteins by comparing their shapes have been receiving a lot of attention. Proteins are functional units, and their shape flexibility plays an essential role in various biological processes. Several shape descriptors have been shown to be capable of comparing protein shapes by treating the proteins as rigid bodies, but this assumption may lead to incorrect comparisons of flexible protein shapes.

Results: We introduce an efficient approach for comparing flexible protein shapes by adapting a local diameter (LD) descriptor. The LD descriptor, developed recently to handle skeleton-based shape deformations [1], is adapted in this work to capture the properties that remain invariant under shape deformations caused by the motion of the protein backbone. Every sampled point on the protein surface is assigned a value measuring the diameter of the 3D shape in the neighborhood of that point. The LD descriptor is then built as a one-dimensional histogram of the distribution of these diameter values. This histogram-based shape representation reduces the comparison of flexible protein shapes to a simple distance calculation between 1D feature vectors. Experimental results indicate that the LD descriptor handles protein shape deformation accurately. In addition, we use the LD descriptor for protein shape retrieval and compare its effectiveness with that of conventional shape descriptors. A sensitivity-specificity plot shows that the LD descriptor performs much better than the conventional shape descriptors in terms of consistency within a family of proteins and discernibility across different protein families.

Conclusion: Our study provides an effective technique for comparing the shapes of flexible proteins. The experimental results demonstrate the insensitivity of the LD descriptor to protein shape deformation. The proposed method is potentially useful for retrieving molecules with similar shapes and for rapid structure retrieval for proteins. The demos and supplemental materials are available at https://engineering.purdue.edu/PRECISE/LDD.
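
The histogram step described in the Results can be illustrated with a minimal sketch. It assumes the per-point local diameter values have already been computed by some other means; the function names `ld_histogram` and `l1_distance`, the bin count, the diameter cap, the sample values, and the choice of the L1 distance are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def ld_histogram(diameters: Sequence[float], bins: int, max_diameter: float) -> List[float]:
    """Build a normalized 1D histogram from local diameter values.

    Hypothetical helper: the binning scheme and the max_diameter cap are
    illustrative assumptions, not the authors' parameters.
    """
    if bins <= 0 or max_diameter <= 0:
        raise ValueError("bins and max_diameter must be positive")
    if len(diameters) == 0:
        raise ValueError("at least one diameter value is required")
    counts = [0] * bins
    for d in diameters:
        # Map the value to a bin index and clamp it into [0, bins - 1].
        idx = int(d / max_diameter * bins)
        idx = max(0, min(idx, bins - 1))
        counts[idx] += 1
    # Normalize so that shapes sampled with different point counts are comparable.
    total = len(diameters)
    return [c / total for c in counts]


def l1_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """L1 distance between two histograms (one possible choice of metric)."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) for a, b in zip(h1, h2))


# Made-up diameter samples: two conformations of one shape and an unrelated shape.
open_form = [2.0, 2.1, 5.0, 5.2, 7.9]
closed_form = [2.1, 2.0, 5.1, 5.0, 8.0]
other_shape = [1.0, 1.2, 1.1, 3.0, 3.1]

h_open = ld_histogram(open_form, bins=4, max_diameter=8.0)
h_closed = ld_histogram(closed_form, bins=4, max_diameter=8.0)
h_other = ld_histogram(other_shape, bins=4, max_diameter=8.0)

same_family = l1_distance(h_open, h_closed)
different_family = l1_distance(h_open, h_other)
```

In this toy input the two conformations fall into the same bins, so their distance is smaller than the distance to the unrelated shape. Real data would need many more sample points and a bin width chosen to suit the size of the proteins being compared.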

[1] Rachel Kolodny, et al. Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures, 2005, Journal of molecular biology.

[2] Thomas Lengauer, et al. Moment invariants as shape recognition technique for comparing protein binding sites, 2007, Bioinform.

[3] M. L. Connolly. Measurement of protein surface shape by solid angles, 1986.

[4] P E Bourne, et al. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path, 1998, Protein engineering.

[5] Michael G. Strintzis, et al. Three-Dimensional Shape-Structure Comparison Method for Protein Classification, 2006, IEEE/ACM Transactions on Computational Biology and Bioinformatics.

[6] J. A. Grant, et al. A Gaussian Description of Molecular Shape, 1995.

[7] M. Gerstein, et al. Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores, 2000, Journal of molecular biology.

[8] Patrick Cousot. Program analysis: the abstract interpretation perspective, 1996, CSUR.

[9] Cláudio M. Gomes, et al. Conformational States and Protein Stability from a Proteomic Perspective, 2007.

[10] Chris Sander, et al. Touring protein fold space with Dali/FSSP, 1998, Nucleic Acids Res.

[11] Dietmar Saupe, et al. 3D Model Retrieval, 2001.

[12] Lora Mak, et al. An extension of spherical harmonics to region-based rotationally invariant descriptors for molecular shape description and comparison, 2008, Journal of molecular graphics & modelling.

[13] W. Graham Richards, et al. Ultrafast shape recognition to search compound databases for similar molecular shapes, 2007, J. Comput. Chem.

[14] J. Janin, et al. Structural domains in proteins and their role in the dynamics of protein function, 1983, Progress in biophysics and molecular biology.

[15] Karthik Ramani, et al. Developing an engineering shape benchmark for CAD models, 2006, Comput. Aided Des.

[16] K Schulten, et al. Protein domain movements: detection of rigid domains and visualization of hinges in comparisons of atomic coordinates, 1997, Proteins.

[17] William J Welsh, et al. Shape Signatures: speeding up computer aided drug discovery, 2006, Drug discovery today.

[18] Bin Li, et al. Characterization of local geometry of protein surfaces with the visibility criterion, 2008, Proteins.

[19] Mark Gerstein, et al. FlexOracle: predicting flexible hinges by identification of stable domains, 2007, BMC Bioinformatics.

[20] William J. Welsh, et al. Enrichment of Ligands for the Serotonin Receptor Using the Shape Signatures Approach, 2005, J. Chem. Inf. Model.

[21] K. Mizuguchi, et al. Comparison of spatial arrangements of secondary structural elements in proteins, 1995, Protein engineering.

[22] Adam Godzik, et al. Flexible structure alignment by chaining aligned fragment pairs allowing twists, 2003, ECCB.

[23] Daniel Cohen-Or, et al. Consistent mesh partitioning and skeletonisation using the shape diameter function, 2008, The Visual Computer.

[24] Arie E. Kaufman. Volume visualization, 1996, CSUR.

[25] Janet M. Thornton, et al. Real spherical harmonic expansion coefficients as 3D shape descriptors for protein binding pocket and ligand comparisons, 2005, Bioinform.

[26] Jun-Hai Yong, et al. A quasi-Monte Carlo method for computing areas of point-sampled surfaces, 2006, Comput. Aided Des.

[27] C. Anfinsen. Principles that govern the folding of protein chains, 1973, Science.

[28] Mark Gerstein, et al. Using Iterative Dynamic Programming to Obtain Accurate Pairwise and Multiple Alignments of Protein Structures, 1996, ISMB.

[29] R. Lathrop. The protein threading problem with sequence amino acid interaction preferences is NP-complete, 1994, Protein engineering.

[30] Ramaswamy Nilakantan, et al. New method for rapid characterization of molecular shapes: applications in drug design, 1993, J. Chem. Inf. Comput. Sci.

[31] Jeng-Shyang Pan, et al. A New 3D Shape Descriptor Based on Rotation, 2006, Sixth International Conference on Intelligent Systems Design and Applications.

[32] Mark Gerstein, et al. MolMovDB: analysis and visualization of conformational change and structural flexibility, 2003, Nucleic Acids Res.

[33] Bin Li, et al. Fast protein tertiary structure retrieval based on global surface shape similarity, 2008, Proteins.

[34] Daniel Cohen-Or, et al. Volume graphics, 1993, Computer.

[35] Hwan Pyo Moon, et al. Mathematical Theory of Medial Axis Transform, 1997.

[36] Guillermo Moyna, et al. Shape signatures: a new approach to computer-aided ligand- and receptor-based drug design, 2003, Journal of medicinal chemistry.

[37] M. Baker, et al. Identification of secondary structure elements in intermediate-resolution density maps, 2007, Structure.

[38] A. Lesk, et al. The relation between the divergence of sequence and structure in proteins, 1986, The EMBO journal.

[39] K Henrick, et al. Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions, 2004, Acta crystallographica. Section D, Biological crystallography.

[40] Ariel Shamir, et al. Pose-Oblivious Shape Signature, 2007, IEEE Transactions on Visualization and Computer Graphics.

[41] M. L. Connolly. Solvent-accessible surfaces of proteins and nucleic acids, 1983, Science.

[42] Bernard Chazelle, et al. Shape distributions, 2002, TOGS.

[43] Bialek, et al. Properties and origins of protein secondary structure, 1994, Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics.

[44] Pedro J. Ballester, et al. Ultrafast shape recognition for similarity search in molecular databases, 2007, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[45] Heather A Carlson, et al. Gaussian-weighted RMSD superposition of proteins: a structural comparison for flexible proteins and predicted protein structures, 2006, Biophysical journal.

[46] D. Jacobs, et al. Protein flexibility predictions using graph theory, 2001, Proteins.

[47] Karthik Ramani, et al. Using least median of squares for structural superposition of flexible proteins, 2009, BMC Bioinformatics.

[48] B. Li, et al. Rapid comparison of properties on protein surface, 2008, Proteins.

[49] D.M. Mount, et al. An Efficient k-Means Clustering Algorithm: Analysis and Implementation, 2002, IEEE Trans. Pattern Anal. Mach. Intell.