Genome assembly of a tropical maize inbred line provides insights into structural variation and crop improvement

Maize is one of the most important crops globally, and it shows remarkable genetic diversity. Knowledge of this diversity could help in crop improvement; however, gold-standard genomes have been elucidated only for modern temperate varieties. Here, we present a high-quality reference genome (contig N50 of 15.78 megabases) of the maize small-kernel (SK) inbred line, which is derived from a tropical landrace. Using haplotype maps derived from B73, Mo17 and SK, we identified 80,614 polymorphic structural variants across 521 diverse lines. Approximately 22% of these variants could not be detected by traditional single-nucleotide-polymorphism-based approaches, and some of them could affect gene expression and trait performance. To illustrate the utility of the diverse SK line, we used it to perform map-based cloning of a major-effect quantitative trait locus controlling kernel weight, a key trait selected during maize improvement. The underlying candidate gene ZmBARELY ANY MERISTEM1d provides a target for increasing crop yields.

A high-quality reference genome of the maize SK inbred line and comparative analyses of the tropical SK line and two other maize genomes, B73 and Mo17, provide insights into structural variation and crop improvement.
