Comparison of protein repeat classifications based on structure and sequence families.

Tandem repeats (TRs) in proteins are common in nature and have several unique functions. They come in various forms that are frequently difficult to recognize from sequence alone. A previously proposed structural classification has recently been implemented in the RepeatsDB database. It defines five main classes, based mainly on repeat unit length, with subclasses representing specific folds. Sequence-based resources such as Pfam provide an alternative classification based on evolutionarily conserved repeat families. Here, we present a detailed comparison between the structural classes in RepeatsDB and the corresponding Pfam repeat families and clans. Most instances map one-to-one between structure and sequence. Notable exceptions, such as leucine-rich repeats (LRRs) and α-solenoids, are discussed.
