Gene number in an invertebrate chordate, Ciona intestinalis.

Gene number can be considered a pragmatic measure of biological complexity, but reliable data are scarce. Estimates for vertebrates are 50,000-100,000 genes per haploid genome, whereas invertebrate estimates fall below 25,000. We wished to test the hypothesis that the origin of vertebrates coincided with extensive gene creation. A prediction of this hypothesis is that gene number will differ sharply between invertebrate and vertebrate members of the chordate phylum. We developed a gene number estimation method that requires only limited sequence sampling of genomic DNA and validated it using data for Caenorhabditis elegans. Using the method, we estimated that the invertebrate chordate Ciona intestinalis has 15,500 ± 3,700 protein-coding genes. This number is significantly lower than gene numbers of vertebrate chordates, but similar to those of invertebrates in distantly related phyla. The data indicate that evolution of vertebrates was accompanied by a dramatic increase in protein-coding capacity of the genome.
