A geometric invariant-based framework for the analysis of protein conformational space

MOTIVATION: The restricted nature of the protein local conformational space has remained difficult to characterize, so protein modeling still requires a computationally expensive conformational search. Moreover, owing to the lack of unilateral structural descriptors, conventional data mining techniques, such as clustering and classification, have not been applied in protein structure analysis.

RESULTS: We first map the local conformations into a fixed-dimensional space by using a carefully selected suite of geometric invariants (GIs) and then reduce the number of dimensions via principal component analysis (PCA). The distribution of the conformations in the space spanned by the first four principal components (PCs) is visualized as a set of conditional bivariate probability distribution plots, in which the peaks correspond to the preferred conformations. The locations of the different canonical structures in the PC-space are interpreted in terms of the weights with which the GIs contribute to the first four PCs. Clustering of the available conformations reveals that the number of preferred local conformations is several orders of magnitude smaller than that suggested previously.

SUPPLEMENTARY INFORMATION: www.it.iitb.ac.in/~ashish/bioinfo2005/
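The two-step mapping described under RESULTS (invariants first, PCA second) can be illustrated with a minimal sketch. The abstract does not say which GIs were used, so the sketch substitutes inter-atomic distances, which are unchanged by rotation and translation, as a stand-in; the function names `pairwise_distance_invariants` and `pca_project`, the eight-atom fragment length, and the random coordinates are all hypothetical and serve only to make the example run.

```python
import numpy as np


def pairwise_distance_invariants(coords):
    """Map one fragment (n_atoms x 3) to a fixed-length vector of distances.

    Assumption: distances stand in for the paper's unspecified GI suite.
    """
    coords = np.asarray(coords, dtype=float)
    i, j = np.triu_indices(len(coords), k=1)  # all atom pairs with i < j
    return np.linalg.norm(coords[i] - coords[j], axis=1)


def pca_project(features, n_components=4):
    """Center the feature matrix and project it onto the leading PCs via SVD."""
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components]          # weight of each invariant on each PC
    scores = Xc @ loadings.T              # coordinates in PC-space
    explained = (S ** 2)[:n_components] / np.sum(S ** 2)
    return scores, loadings, explained


# Synthetic stand-in data: 200 fragments of 8 atoms each (not real structures).
rng = np.random.default_rng(0)
fragments = rng.normal(size=(200, 8, 3))

features = np.array([pairwise_distance_invariants(f) for f in fragments])
scores, loadings, explained = pca_project(features, n_components=4)
```

Each row of `loadings` gives the weights of the invariants on one PC, which is the quantity the abstract uses to interpret where canonical structures fall in PC-space. Note that distances are also unchanged by reflection, so this particular stand-in cannot distinguish a fragment from its mirror image.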
