IRIS: Internal Repeat Identification System

Protein repeats play a significant role in protein function analysis and structural evolution. About 25% of all proteins in eukaryotic species contain repeat structures, and most of them do not yet have resolved structural information. Therefore, this study aimed to identify internal repeats solely from protein sequence information. Traditional detection methods rely on amino acid sequence alignment to detect repeats in protein sequences, but their performance is satisfactory only when sequence similarity is high. In this study, a novel method is proposed based on predicted secondary structure element information. Sequences are first transformed into Length Encoded Secondary Structure (LESS) profiles, which are then subjected to autocorrelation analysis. Preliminary experimental results show that the developed Internal Repeat Identification System (IRIS) can successfully identify internal repeats in some well-known protein structures, such as the TATA-box binding protein of Sulfolobus acidocaldarius and porcine ribonuclease inhibitor complexed with ribonuclease.
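The two steps named in the abstract, length encoding of secondary structure followed by autocorrelation, can be illustrated with a minimal sketch. The abstract does not specify the actual LESS encoding or the scoring used by IRIS, so everything here is an assumption made for illustration: the three-state alphabet (H, E, C), the run-length representation, the match rule with a length tolerance `tol`, and the helper names `length_encode`, `autocorrelation`, and `best_lag`.

```python
from itertools import groupby


def length_encode(ss):
    """Run-length encode a predicted secondary-structure string.

    Example: "HHHCC" -> [("H", 3), ("C", 2)].
    Assumed stand-in for a LESS profile; the real encoding may differ.
    """
    return [(state, len(list(run))) for state, run in groupby(ss)]


def autocorrelation(profile, lag, tol=1):
    """Fraction of element pairs (i, i + lag) that match.

    Two elements match when they share the same state and their
    lengths differ by at most `tol` residues (hypothetical rule).
    """
    n = len(profile) - lag
    if lag <= 0 or n <= 0:
        return 0.0
    hits = sum(
        1
        for (s1, l1), (s2, l2) in zip(profile, profile[lag:])
        if s1 == s2 and abs(l1 - l2) <= tol
    )
    return hits / n


def best_lag(profile, tol=1):
    """Lag (in elements) with the highest autocorrelation, or None."""
    lags = range(1, len(profile) // 2 + 1)
    return max(lags, key=lambda k: autocorrelation(profile, k, tol), default=None)


# Toy input: a strand-coil-helix-coil unit repeated three times.
ss = "EEEECCHHHHHHCC" * 3
profile = length_encode(ss)
period = best_lag(profile)
```

On this toy input the repeat unit spans four secondary structure elements, so the autocorrelation peaks at a lag of four elements. Working on element runs rather than residues is what lets such a comparison tolerate low sequence similarity, since only the order and approximate lengths of the structural elements are compared.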
