Efficient algorithms for identifying orthologous simple sequence repeats of disease genes

Dynamic mutations of simple sequence repeats (SSRs) have been shown to affect normal gene function and to cause a range of genetic disorders. Several conserved, and in some cases partially functional, SSR patterns have been discovered in inherited orthologous disease genes. To explore a wide range of SSRs in genetic diseases, a comprehensive system was constructed that identifies orthologous SSRs of disease genes through comparative genomics, using the Online Mendelian Inheritance in Man (OMIM) and NCBI HomoloGene databases as the primary sources of human genetic disease and homologous gene information. In addition, an efficient algorithm for searching SSR patterns was developed to provide annotated SSR information across various model species. By integrating these data resources and mining techniques, biologists and physicians can systematically retrieve novel and important conserved SSR information among orthologous disease genes. The proposed system, Orthologous SSR for Disease Genes (OSDG), is the first comprehensive framework for identifying orthologous SSRs as potential causative factors of genetic disorders and is freely available at http://osdg.cs.ntou.edu.tw/.
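The abstract does not describe the SSR search algorithm itself, so the following is only a minimal sketch of what an SSR scan might look like, not the method used in OSDG. It assumes perfect (mismatch-free) repeats, motif lengths of 1 to 6 nucleotides, and a minimum of three repeat units; the function name `find_perfect_ssrs`, the `SSR` record, and all parameter defaults are hypothetical choices made for illustration.

```python
import re
from typing import List, NamedTuple


class SSR(NamedTuple):
    """One perfect SSR hit (hypothetical record type for this sketch)."""
    start: int    # 0-based start position in the sequence
    motif: str    # repeated unit, e.g. "CAG"
    repeats: int  # number of consecutive copies of the motif


def find_perfect_ssrs(seq: str, min_unit: int = 1, max_unit: int = 6,
                      min_repeats: int = 3) -> List[SSR]:
    """Scan a DNA sequence for perfect tandem repeats using a regex backreference.

    Assumption: only exact repeats are reported; imperfect (fuzzy) SSRs
    would need a different technique.
    """
    seq = seq.upper()
    hits: List[SSR] = []
    for unit in range(min_unit, max_unit + 1):
        # ([ACGT]{unit}) captures a motif; \1{n,} requires n more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ACAC"),
            # since the shorter motif already reports the same region.
            if any(unit % k == 0 and motif == motif[:k] * (unit // k)
                   for k in range(1, unit)):
                continue
            hits.append(SSR(m.start(), motif, len(m.group(0)) // unit))
    return sorted(hits)


if __name__ == "__main__":
    # A CAG tract of the kind associated with trinucleotide repeat disorders.
    for hit in find_perfect_ssrs("AT" + "CAG" * 5 + "TT"):
        print(hit)
```

In a comparative setting, the same scan could be run on each orthologous gene sequence and the resulting motifs compared across species; how OSDG actually performs that comparison is not specified here.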
