Gene content and density in banana (Musa acuminata) as revealed by genomic sequencing of BAC clones

The complete sequences of Musa acuminata bacterial artificial chromosome (BAC) clones are presented, providing the first analysis of banana genome organization. One clone (MuH9) is 82,723 bp long with an overall G+C content of 38.2%. Twelve putative protein-coding sequences were identified, representing a gene density of one gene per 6.9 kb, which is slightly lower than that previously reported for Arabidopsis but similar to that of rice. One coding sequence was identified as a partial M. acuminata malate synthase, while the remaining sequences showed similarity to predicted or hypothetical proteins identified in genome sequence data. A second BAC clone (MuG9) is 73,268 bp long with an overall G+C content of 38.5%. Only seven putative coding regions were identified, representing a gene density of one gene per 10.5 kb, which is strikingly lower than that of the first BAC. One coding sequence showed significant homology to the large subunit of soybean ribonucleotide reductase. A transition point between coding regions and repeated sequences was found at approximately 45 kb, separating the gene-containing upstream end of the BAC from the downstream end, which mainly contained transposon-like sequences and regions similar to known repetitive sequences of M. acuminata. This gene organization resembles that of Gramineae genomes, in which genes are clustered in gene-rich regions separated by gene-poor DNA containing abundant transposons.
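The gene densities quoted above follow directly from each clone's length and its number of putative coding sequences. The short Python sketch below reproduces that arithmetic; the function name `gene_density_kb` and the `clones` dictionary are illustrative choices, not part of the original analysis.

```python
def gene_density_kb(length_bp: int, n_genes: int) -> float:
    """Return the average amount of sequence (in kb) per predicted gene."""
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return length_bp / n_genes / 1000


# Clone name -> (length in bp, number of putative coding sequences),
# both values taken from the abstract.
clones = {
    "MuH9": (82_723, 12),
    "MuG9": (73_268, 7),
}

# Round to one decimal place, matching the precision used in the abstract.
densities = {
    name: round(gene_density_kb(length_bp, n_genes), 1)
    for name, (length_bp, n_genes) in clones.items()
}

for name, density in densities.items():
    print(f"{name}: one gene per {density} kb")
```

These figures are averages over each whole clone. For MuG9, where the predicted genes lie upstream of the transition point at roughly 45 kb, the local density in the gene-containing part would be higher than the clone-wide average suggests.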
