An algorithm to find distant repeats in a pair of protein sequences

Distant repeats between a pair of protein sequences can be exploited to study various aspects of proteins, such as structure-function relationships, disorders caused by protein malfunction, and evolutionary history. An in-depth analysis of the distant repeats would help establish a stable evolutionary relation of the repeats with respect to their three-dimensional structure. To this end, an algorithm has been devised to identify distant repeats in a pair of protein sequences, essentially by using the scores of PAM (Percent Accepted Mutation) matrices. The proposed algorithm should be useful to researchers involved in comparative studies of organisms based on amino-acid repeats in protein sequences.
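The abstract does not spell out the steps of the algorithm, so the following is only a minimal sketch of one way substitution-matrix scores could be used to flag similar, non-identical fragments shared by two sequences. It assumes a fixed-length sliding window and a summed-score cutoff. The names `find_distant_repeats`, `pair_score`, `window` and `min_score` are hypothetical, and `TOY_SCORES` holds illustrative values, not real PAM entries. It should not be read as the published method.

```python
from typing import Callable, List, Tuple

# Illustrative substitution scores only. These are NOT real PAM values;
# a real run would load a full 20x20 PAM matrix instead.
TOY_SCORES = {
    ("A", "A"): 2, ("G", "G"): 5, ("L", "L"): 6, ("I", "I"): 5,
    ("L", "I"): 2, ("A", "G"): 1,
}


def pair_score(a: str, b: str, default: int = -2) -> int:
    """Symmetric lookup; unknown residue pairs get an assumed penalty."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), default))


Hit = Tuple[int, int, str, str, int]


def find_distant_repeats(
    seq1: str,
    seq2: str,
    window: int,
    min_score: int,
    score: Callable[[str, str], int] = pair_score,
) -> List[Hit]:
    """Compare every length-`window` fragment of seq1 with every such
    fragment of seq2, and keep pairs whose summed score >= min_score.

    Returns (start1, start2, fragment1, fragment2, total_score) tuples,
    with 0-based start positions.
    """
    hits: List[Hit] = []
    for i in range(len(seq1) - window + 1):
        frag1 = seq1[i:i + window]
        for j in range(len(seq2) - window + 1):
            frag2 = seq2[j:j + window]
            total = sum(score(a, b) for a, b in zip(frag1, frag2))
            if total >= min_score:
                hits.append((i, j, frag1, frag2, total))
    return hits


if __name__ == "__main__":
    # GALG vs GAIG differ at one position (L/I) but still score highly.
    print(find_distant_repeats("GALG", "GAIG", window=4, min_score=10))
    # prints [(0, 0, 'GALG', 'GAIG', 14)]
```

This brute-force comparison costs time proportional to the product of the two sequence lengths and the window size, which is acceptable for a single pair of proteins. The cutoff and window length would need to be tuned against a real PAM matrix.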
