First Exons and Introns - A Survey of GC Content and Gene Structure in the Human Genome

Most transcriptional regulatory elements are located in non-coding DNA. In particular, some first introns play a vital role in transcriptional control and splicing. The length and GC content of first exons and first introns in complex organisms suggest that these structural units are likely to be important functional elements in large genomes. In this paper we therefore perform a systematic comparison of exon-intron structure and GC content across all known genes in the human genome. Our in silico analysis shows that the GC content of introns and exons varies significantly with their length. On average, the first intron of a gene is significantly longer than the other introns in the same gene. Our results also show that first introns and first exons are more GC-rich than last and internal introns and exons. These results provide insight into the structure of eukaryotic genes, and they confirm and extend the previously identified regulatory potential of first exons and introns.
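
The kind of comparison described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names (`gc_content`, `split_gene`, `positional_class`, `summarize`), the 0-based half-open exon coordinates, and the choice to ignore ambiguous bases such as N are all assumptions made for illustration. Given a gene sequence and its exon coordinates, the sketch extracts exons and introns, labels each unit as first, internal, or last, and reports mean length and mean GC fraction per class.

```python
from statistics import mean


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def split_gene(seq: str, exons: list[tuple[int, int]]):
    """Split a gene sequence into exon and intron sequences.

    `exons` holds (start, end) pairs, 0-based and half-open, sorted in
    transcript order. Introns are the gaps between consecutive exons.
    """
    exon_seqs = [seq[start:end] for start, end in exons]
    intron_seqs = [
        seq[prev_end:next_start]
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]
    return exon_seqs, intron_seqs


def positional_class(index: int, count: int) -> str:
    """Label a unit by position; a gene's only unit counts as 'first'."""
    if index == 0:
        return "first"
    if index == count - 1:
        return "last"
    return "internal"


def summarize(units: list[str]) -> dict[str, tuple[float, float]]:
    """Return {class: (mean length, mean GC fraction)} for one gene's units."""
    groups: dict[str, list[str]] = {}
    for i, unit in enumerate(units):
        groups.setdefault(positional_class(i, len(units)), []).append(unit)
    return {
        label: (mean(len(u) for u in members), mean(gc_content(u) for u in members))
        for label, members in groups.items()
    }
```

A genome-wide survey would apply `summarize` to the exons and introns of every annotated gene and then pool the per-class values before testing for differences; the source of the annotations and the statistical tests are left open here.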
