The complex sequence landscape of maize revealed by single molecule technologies

Complete and accurate reference genomes and annotations provide fundamental tools for characterizing genetic and functional variation. These resources facilitate the elucidation of biological processes and support the translation of research findings into improved and sustainable agricultural technologies. Many reference genomes for crop plants have been generated over the past decade, but these genomes are often fragmented and missing complex repeat regions. Here, we report the assembly and annotation of a reference genome of maize, a genetic and agricultural model crop, using Single Molecule Real-Time (SMRT) sequencing and a high-resolution genome map. Relative to the previous reference genome, our assembly features a 52-fold increase in contig length and significant improvements in the assembly of intergenic spaces and centromeres. Characterization of the repetitive portion of the genome revealed over 130,000 intact transposable elements (TEs), allowing us to identify TE lineage expansions unique to maize. Gene annotations were updated using 111,000 full-length transcripts obtained by SMRT sequencing. In addition, comparative optical mapping of two other inbred lines revealed a prevalence of deletions in regions of low gene density and in maize lineage-specific genes.
