Functional and evolutionary analyses on expressed intronless genes in the mouse genome

Using computational approaches, we have identified 2017 expressed intronless genes in the mouse genome. Evolutionary analysis reveals that 56 of these intronless genes are conserved among the three domains of life: bacteria, archaea and eukaryotes. These highly conserved intronless genes were found to be involved in essential housekeeping functions. About 80% of expressed mouse intronless genes have orthologs in eukaryotic genomes only, and are thus specific to eukaryotic organisms. Of these genes, 608 have intronless human orthologs, and 302 of those orthologs have a match in the OMIM database. Investigating these mouse genes will be important for generating mouse models to understand human diseases.
