First pass annotation of promoters on human chromosome 22.

The publication of the first almost complete sequence of a human chromosome (chromosome 22) is a major milestone in human genomics. Together with the sequence, an excellent gene annotation was published, which will certainly serve as an information resource for numerous future projects. We noted that the annotation did not cover regulatory regions; in particular, no promoter annotation was provided. Here we present an analysis of the complete published chromosome 22 sequence for promoters. A recent breakthrough in specific in silico prediction of promoter regions enabled us to attempt large-scale promoter prediction on chromosome 22. Scanning of sequence databases revealed only 20 experimentally verified promoters, of which 10 were correctly predicted by our approach. Nearly 40% of our 465 predicted promoter regions are supported by the currently available gene annotation. Promoter finding also provides a biologically meaningful method for "chromosomal scaffolding", by which long genomic sequences can be divided into segments that each start with a gene. As one example, combining promoter region prediction with exon/intron structure prediction greatly enhances the specificity of de novo gene finding. The present study demonstrates that promoters can be identified in silico at the chromosomal level with sufficient reliability for experimental planning, and it indicates that a wealth of information about regulatory regions can be extracted from current large-scale (megabase) sequencing projects. Results are available on-line at http://genomatix.gsf.de/chr22/.
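The "chromosomal scaffolding" idea and the reported recovery rate can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 0-based half-open coordinates, and the example positions are hypothetical, and strand orientation is ignored for simplicity. It only shows how predicted promoter start positions could cut a long sequence into segments that each begin at a promoter, and how the fraction of verified promoters recovered (10 of 20 in the abstract) is computed.

```python
from typing import List, Tuple


def scaffold_by_promoters(seq_length: int,
                          promoter_starts: List[int]) -> List[Tuple[int, int]]:
    """Split [0, seq_length) into half-open segments that each begin at a
    predicted promoter start. Sequence upstream of the first promoter is
    not assigned to any segment. Strand is ignored (an assumption)."""
    # Keep only in-range positions, drop duplicates, and sort them.
    starts = sorted({p for p in promoter_starts if 0 <= p < seq_length})
    segments = []
    for i, start in enumerate(starts):
        # A segment ends where the next promoter begins, or at the sequence end.
        end = starts[i + 1] if i + 1 < len(starts) else seq_length
        segments.append((start, end))
    return segments


def fraction_recovered(correctly_predicted: int, verified_total: int) -> float:
    """Fraction of experimentally verified promoters that were predicted."""
    if verified_total <= 0:
        raise ValueError("verified_total must be positive")
    return correctly_predicted / verified_total


if __name__ == "__main__":
    # Hypothetical promoter positions on a toy 1000 bp sequence.
    print(scaffold_by_promoters(1000, [700, 100, 400]))
    # Figures quoted in the abstract: 10 of 20 verified promoters predicted.
    print(fraction_recovered(10, 20))
```

In practice, each resulting segment could then be passed to an exon/intron structure predictor, which is the combination the abstract describes as improving the specificity of de novo gene finding.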
