Genome-wide computational approach for the prediction of duplications generating protein localization signals

Short tandem duplications can generate sequence motifs that cause aberrant protein localization, a mechanism with demonstrated pathogenic potential in diseases such as acute myeloid leukemia (AML). In this paper we introduce a new computational method for predicting genomic positions at which hypothetical mutation events, such as micro-duplications, might give rise to sequences encoding molecular patterns such as localization or export signals. The proposed framework allows motifs of unconstrained length, defined as regular expressions, to be studied at a genome-wide level, providing an in silico platform for analyzing the potential effect of duplications on abnormal cellular localization.
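The idea of scanning for duplications that create a motif can be illustrated with a minimal sketch. This is not the method of the paper, which is not detailed in this abstract. It is a brute-force toy that assumes a single uppercase ACGT coding sequence read in frame 0, simulates every tandem duplication up to a small length, translates the mutant, and reports duplications whose protein product matches a regular expression that the wild-type product does not. The function names, the `max_len` parameter, and the toy motif `KK` are all hypothetical choices made for illustration; `KK` is not a real localization signal.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def find_duplication_hits(cds, motif, max_len=4):
    """Return (start, duplicated_unit, matched_text) for each simulated tandem
    duplication whose translated product contains a motif match that is
    absent from the wild-type product.

    Assumption: a match counts as new if its matched text does not occur
    among the wild-type matches. This is a crude, coordinate-free criterion.
    """
    pattern = re.compile(motif)
    wild_type_matches = {m.group() for m in pattern.finditer(translate(cds))}
    hits = []
    for start in range(len(cds)):
        for k in range(1, max_len + 1):
            end = start + k
            if end > len(cds):
                break
            unit = cds[start:end]
            # Insert a second copy of the unit directly after the first.
            mutant = cds[:end] + unit + cds[end:]
            for m in pattern.finditer(translate(mutant)):
                if m.group() not in wild_type_matches:
                    hits.append((start, unit, m.group()))
    return hits


if __name__ == "__main__":
    # Toy input: the wild type encodes M-K-G and contains no "KK".
    for hit in find_duplication_hits("ATGAAAGGG", "KK", max_len=3):
        print(hit)
```

Because duplications whose length is not a multiple of three shift the reading frame, the sketch translates the whole mutant rather than only the duplicated region. A genome-wide version would need to handle gene annotations, splicing, and far more efficient pattern matching than this exhaustive loop.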
