ConSole: using modularity of Contact maps to locate Solenoid domains in protein structures

Background: Periodic proteins, characterized by the presence of multiple repeats of short motifs, form an interesting and seldom-studied group. Because their sequences are often extremely divergent, such motifs are detected and analyzed more reliably at the structural level. Yet few algorithms have been developed for the detection and analysis of structures of periodic proteins.

Results: ConSole recognizes modularity in protein contact maps, allowing for precise identification of repeats in solenoid protein structures, an important subgroup of periodic proteins. Tests on benchmarks show that ConSole has higher recognition accuracy than Raphael, the only other publicly available solenoid structure detection tool. As a next step of ConSole analysis, we show how detection of solenoid repeats in structures can be used to improve sequence recognition of these motifs and to detect subtle irregularities of repeat lengths in three solenoid protein families.

Conclusions: The ConSole algorithm provides a fast and accurate tool to recognize solenoid protein structures as a whole and to identify individual solenoid repeat units from a structure. ConSole is available as an interactive web server and for download at http://console.sanfordburnham.org.
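To illustrate why a contact map carries a repeat signal, the following minimal Python sketch builds a toy coil of C-alpha-like points, computes its contact map, and reads the repeat length from the band of contacts running parallel to the main diagonal. This is not the ConSole algorithm: the toy geometry, the 8 Å cutoff, the `min_sep` and `density` parameters, and all function names are assumptions chosen only for illustration.

```python
import numpy as np


def toy_solenoid(n_res=120, period=24, radius=14.5, rise=4.8):
    """Hypothetical idealized coil: `period` residues per turn, `rise` angstroms per turn."""
    i = np.arange(n_res)
    theta = 2.0 * np.pi * i / period
    return np.column_stack(
        [radius * np.cos(theta), radius * np.sin(theta), rise * i / period]
    )


def contact_map(coords, cutoff=8.0):
    """Boolean matrix: True where two points lie closer than `cutoff` angstroms."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist < cutoff


def offset_profile(cmap, min_sep=6):
    """Fraction of contacts on each diagonal |i - j| = k, ignoring k < min_sep."""
    n = cmap.shape[0]
    profile = np.zeros(n)
    for k in range(min_sep, n):
        profile[k] = np.diagonal(cmap, offset=k).mean()
    return profile


def estimate_period(cmap, min_sep=6, density=0.5):
    """Median offset of the densely populated diagonals, or None if there are none."""
    profile = offset_profile(cmap, min_sep)
    band = np.flatnonzero(profile >= density)
    return int(np.median(band)) if band.size else None


if __name__ == "__main__":
    cmap = contact_map(toy_solenoid())
    print("estimated repeat length:", estimate_period(cmap))
```

In this toy model, residues one turn apart stack on top of each other, so the contact map shows a stripe at a sequence separation equal to the repeat length. Real structures have irregular repeats and insertions, which is the harder problem the abstract describes.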
