Analysis of the genome sequence of the flowering plant Arabidopsis thaliana

The flowering plant Arabidopsis thaliana is an important model system for identifying genes and determining their functions. Here we report the analysis of the genomic sequence of Arabidopsis. The sequenced regions cover 115.4 megabases of the 125-megabase genome and extend into centromeric regions. The evolution of Arabidopsis involved a whole-genome duplication, followed by gene loss and extensive local gene duplications, giving rise to a dynamic genome enriched by lateral gene transfer from a cyanobacterial-like ancestor of the plastid. The genome contains 25,498 genes encoding proteins from 11,000 families, a functional diversity similar to that of Drosophila and Caenorhabditis elegans, the other sequenced multicellular eukaryotes. Arabidopsis has many families of new proteins but also lacks several common protein families, indicating that the sets of common proteins have undergone differential expansion and contraction in the three multicellular eukaryotes. This is the first complete genome sequence of a plant. It provides the foundations for more comprehensive comparison of conserved processes in all eukaryotes, for identifying a wide range of plant-specific gene functions, and for establishing rapid, systematic ways to identify genes for crop improvement.
