Mining EST databases to resolve evolutionary events in major crop species.

Using plant EST collections, we identified 1392 potential gene duplicates across eight plant species: Zea mays, Oryza sativa, Sorghum bicolor, Hordeum vulgare, Solanum tuberosum, Lycopersicon esculentum, Medicago truncatula, and Glycine max. We estimated the synonymous and nonsynonymous distances between the members of each gene pair and identified mixtures of two to three normal distributions in these distances, corresponding to one to three rounds of genome duplication in each species. Within the Poaceae, we found a duplication event conserved among all four species that occurred approximately 50-60 million years ago (Mya), probably before the major radiation of the grasses. In the Solanaceae, we found evidence for a conserved duplication event approximately 50-52 Mya. A duplication in soybean occurred approximately 44 Mya, and a duplication in Medicago occurred approximately 58 Mya. Comparing synonymous and nonsynonymous distances showed that most duplicate gene pairs are under purifying (negative) selection. We calculated Pearson's correlation coefficients to measure how gene expression patterns have diverged between the members of each duplicate pair, and compared these coefficients across evolutionary distances. This analysis showed that some duplicate pairs retained correlated expression patterns, whereas others showed uncorrelated expression.
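The three calculations named in the abstract (dating a duplication from synonymous distance, classifying selection from the ratio of nonsynonymous to synonymous distance, and correlating expression between duplicates) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names, the example values, and the substitution rate of 6.5e-9 synonymous substitutions per site per year are assumptions chosen for illustration, and the dating formula T = Ks / (2r) assumes a constant molecular clock.

```python
from math import sqrt

# ASSUMPTION: illustrative synonymous substitution rate (per site per year).
# The rate actually used in the study is not stated in the abstract.
ASSUMED_RATE = 6.5e-9


def divergence_time_mya(ks, rate=ASSUMED_RATE):
    """Estimate duplication age in millions of years from synonymous distance.

    Uses T = Ks / (2 * r): both copies accumulate substitutions independently,
    so the pairwise distance grows at twice the per-lineage rate.
    """
    if ks < 0 or rate <= 0:
        raise ValueError("ks must be non-negative and rate must be positive")
    return ks / (2.0 * rate) / 1e6


def selection_class(ka, ks):
    """Classify a duplicate pair by its Ka/Ks ratio (hypothetical helper)."""
    if ks <= 0:
        raise ValueError("ks must be positive to form the Ka/Ks ratio")
    omega = ka / ks
    if omega < 1.0:
        return "purifying"
    if omega > 1.0:
        return "positive"
    return "neutral"


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


# Hypothetical duplicate pair: distances and EST counts across four libraries.
example_ka, example_ks = 0.05, 0.65
copy_a_counts = [12, 3, 7, 20]
copy_b_counts = [10, 4, 6, 18]

print(divergence_time_mya(example_ks))
print(selection_class(example_ka, example_ks))
print(pearson(copy_a_counts, copy_b_counts))
```

In this sketch, a correlation near 1 would suggest that the two copies have retained similar expression patterns, while a value near 0 would suggest that their expression has become uncorrelated. A fuller analysis would also fit a mixture of normal distributions to the synonymous distances to separate duplication events, which is omitted here.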
