NotI flanking sequences: a tool for gene discovery and verification of the human genome.

A set of 22 551 unique human NotI flanking sequences (16.2 Mb) was generated. More than 40% of the set had regions with significant similarity to known proteins and expressed sequences. The data demonstrate that regions flanking NotI sites are less likely to form nucleosomes efficiently and resemble promoter regions. The draft human genome sequence contained 55.7% of the NotI flanking sequences, Celera's database contained matches to 57.2% of the clones, and all public databases combined (including non-human and previously sequenced NotI flanks) matched 89.2% of the NotI flanking sequences (identity ≥90% over at least 50 bp; data from December 2001). The data suggest that the shotgun sequencing approach used to generate the draft human genome sequence resulted in a bias against cloning and sequencing of NotI flanks. A rough estimate (based primarily on chromosomes 21 and 22) is that the human genome contains 15 000-20 000 NotI sites, of which 6000-9000 are unmethylated in any particular cell. The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and that unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.
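To make the last point concrete, the following is a minimal sketch of the kind of window-based test that CpG island finders commonly apply, together with a scan for the NotI recognition sequence GCGGCCGC (which itself contains two CpG dinucleotides). The thresholds (length of at least 200 bp, GC fraction above 0.5, observed/expected CpG ratio above 0.6) are the widely cited Gardiner-Garden and Frommer criteria and are an assumption here; the abstract does not say which tools or cutoffs the study evaluated. All function names are hypothetical and chosen for illustration only.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence


def find_noti_sites(seq):
    """Return 0-based start positions of every NotI site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(NOTI_SITE)
    while start != -1:
        positions.append(start)
        start = seq.find(NOTI_SITE, start + 1)
    return positions


def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for seq.

    Observed/expected is computed as (#CpG * N) / (#C * #G),
    where N is the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, GC-content, and obs/exp CpG thresholds."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

A test of this form rejects any window whose overall GC fraction falls below the cutoff, regardless of how CpG-dense it is locally, which is one way a CpG-rich stretch in a region of low CG content could be missed.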
