Detection of protein repeats using the Ramanujan Filter Bank

Protein repeats are tandemly repeating segments within an amino acid sequence. They confer several important structural and binding properties on the protein. So far, the most successful detection schemes for such repeats have relied on computationally expensive techniques such as dynamic programming and hidden Markov models (HMMs). Classical DSP tools such as the short-time Fourier transform (STFT) unfortunately perform poorly in the presence of mutations. This work proposes a novel technique based on the recently developed Ramanujan Filter Bank. The method is fast, accurate, and involves only simple integer computations; its performance is demonstrated on several well-known repeat families.
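To make the idea of an integer-only filter bank concrete, here is a minimal sketch of how Ramanujan sums can be used to estimate a period. It is an illustration under stated assumptions, not the authors' implementation: the function names (`mobius`, `ramanujan_sum`, `rfb_energies`, `dominant_period`), the `repeats` parameter, the energy normalization, and the toy numeric signal are all hypothetical. In particular, the abstract does not say how amino acids are mapped to numbers, so the sketch assumes a numeric sequence is already given.

```python
from math import gcd


def mobius(n):
    """Moebius function mu(n) by trial division (fine for small n)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # squared prime factor -> mu = 0
                return 0
            result = -result
        p += 1
    if n > 1:                   # one remaining prime factor
        result = -result
    return result


def ramanujan_sum(q, n):
    """Integer Ramanujan sum c_q(n) = sum over d | gcd(n, q) of mu(q/d) * d."""
    g = gcd(n, q)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)


def rfb_energies(x, max_period, repeats=4):
    """Mean squared output of one FIR filter per candidate period q.

    Filter q holds `repeats` periods of c_q(n). Only outputs where the
    filter fully overlaps the signal are used (assumed design choice),
    and each output is scaled by the filter length so that periods can
    be compared with each other.
    """
    energies = {}
    for q in range(1, max_period + 1):
        length = q * repeats
        if length > len(x):
            continue            # signal too short for this filter
        h = [ramanujan_sum(q, n) for n in range(length)]
        outputs = [
            sum(h[j] * x[n - j] for j in range(length))   # integer arithmetic
            for n in range(length - 1, len(x))
        ]
        energies[q] = sum(y * y for y in outputs) / (len(outputs) * length ** 2)
    return energies


def dominant_period(x, max_period, repeats=4):
    """Candidate period whose filter output carries the most energy."""
    energies = rfb_energies(x, max_period, repeats)
    return max(energies, key=energies.get)


# Toy zero-mean signal with period 3 (illustrative data, not a protein).
toy = [1, -1, 0] * 12
print(dominant_period(toy, max_period=6))   # prints 3
```

The filter coefficients and the convolution use only integers; the single floating-point division is the final normalization. A real detector would also need a numeric encoding of the residues and some way of localizing repeats along the sequence, neither of which is attempted here.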
