Identification of Horizontally-transferred Genomic Islands and Genome Segmentation Points by Using the GC Profile Method

The nucleotide composition of genomes varies dramatically among all three kingdoms of life. GC content, an important characteristic of a genome, is related to many important functions, so GC content and its distribution are routinely reported for sequenced genomes. Traditionally, the GC content distribution is assessed by computing GC content in windows that slide along the genome. This routinely used window-based method has low resolution and low sensitivity, and different window sizes produce different GC content distribution patterns for the same genome. We proposed a windowless method, the GC profile, for displaying GC content variations across the genome. Compared to the window-based method, the GC profile has the following advantages: 1) higher sensitivity, because of variation-amplifying procedures; 2) higher resolution, because boundaries between domains can be located to a single base pair; 3) uniqueness, because the GC profile is unique for a given genome; and 4) the capacity to show both global and regional GC content distributions. These characteristics are useful in identifying horizontally transferred genomic islands and homogeneous GC-content domains. Here, we review the applications of the GC profile in identifying genomic islands and genome segmentation points, and in serving as a platform for integration with other algorithms for genome analysis. A web server that generates GC profiles and implements the relevant genome segmentation algorithms is available at: www.zcurve.net.
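The contrast between the two approaches can be illustrated with a minimal Python sketch. The function names, the toy sequence, and the detrending rule are all assumptions made for illustration. In particular, `detrended_cumulative_gc` is only a simplified stand-in for a windowless curve (a running GC-minus-AT count with the genome-wide average trend subtracted); it is not presented as the exact GC profile definition, which this text does not give.

```python
def windowed_gc(seq, window, step):
    """Window-based GC content: fraction of G/C in each window (illustrative helper)."""
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        values.append(sum(1 for base in chunk if base in "GC") / window)
    return values


def detrended_cumulative_gc(seq):
    """Windowless sketch (assumed formulation, not the published definition).

    Builds a running count that goes up by 1 for G/C and down by 1 for A/T,
    then subtracts the straight line from the origin to the final value.
    The result falls where local GC content is below the sequence average
    and rises where it is above, so turning points mark candidate boundaries.
    """
    seq = seq.upper()
    n = len(seq)
    running = 0
    cumulative = []
    for base in seq:
        if base in "GC":
            running += 1
        elif base in "AT":
            running -= 1
        cumulative.append(running)
    slope = cumulative[-1] / n  # average trend per base
    return [z - slope * (i + 1) for i, z in enumerate(cumulative)]


def lowest_turning_point(curve):
    """1-based position of the curve minimum (a candidate domain boundary)."""
    return min(range(len(curve)), key=lambda i: curve[i]) + 1


# Toy sequence (assumption): a 30-bp AT-only domain followed by a 10-bp GC-only domain.
toy = "A" * 30 + "G" * 10

print(windowed_gc(toy, window=10, step=10))  # pattern with 10-bp windows
print(windowed_gc(toy, window=20, step=20))  # a different pattern with 20-bp windows
print(lowest_turning_point(detrended_cumulative_gc(toy)))  # boundary position
```

On this toy input the two window sizes give different pictures of the same sequence, while the turning point of the cumulative curve falls on the last base of the AT-only domain, which is the sense in which a windowless curve can place a boundary at a single base pair.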
