GOOGA: A platform to synthesize mapping experiments and identify genomic structural diversity

Understanding genomic structural variation such as inversions and translocations is a key challenge in evolutionary genetics. We develop a novel statistical approach to comparative genetic mapping to detect large-scale structural mutations from low-level sequencing data. The procedure, called Genome Order Optimization by Genetic Algorithm (GOOGA), couples a Hidden Markov Model with a Genetic Algorithm to analyze data from genetic mapping populations. We demonstrate the method using both simulated data (calibrated from experiments on Drosophila melanogaster) and real data from five distinct crosses within the flowering plant genus Mimulus. Application of GOOGA to the Mimulus data corrects numerous errors (misplaced sequences) in the M. guttatus reference genome and confirms or detects eight large inversions polymorphic within the species complex. Finally, we show how this method can be applied in genomic scans to improve the accuracy and resolution of Quantitative Trait Locus (QTL) mapping.
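The coupling of a Hidden Markov Model with a Genetic Algorithm can be illustrated with a toy sketch. This is not the GOOGA implementation, and every name and parameter in it is hypothetical. It assumes a two-state (backcross-like) genotype model, a single fixed recombination fraction `r` between adjacent markers, and a single fixed genotyping error rate `err`. The search uses only inversion mutations and elitist selection, without crossover, so it is a simplified stand-in for a full genetic algorithm.

```python
import math
import random


def log_likelihood(order, calls, r=0.1, err=0.05):
    """Forward-algorithm log-likelihood of the genotype calls under a marker order.

    calls[i][m] is the observed genotype (0 or 1) of individual i at marker m.
    Hidden states are the true genotypes; r and err are assumed, fixed values.
    """
    total = 0.0
    for ind in calls:
        # Initial step: uniform prior over the two hidden genotypes.
        f = [0.5 * ((1 - err) if ind[order[0]] == s else err) for s in (0, 1)]
        scale = f[0] + f[1]
        total += math.log(scale)
        f = [x / scale for x in f]
        for m in order[1:]:
            new = []
            for s in (0, 1):
                trans = f[s] * (1 - r) + f[1 - s] * r  # stay, or recombine
                emit = (1 - err) if ind[m] == s else err
                new.append(trans * emit)
            scale = new[0] + new[1]  # rescale to avoid numerical underflow
            total += math.log(scale)
            f = [x / scale for x in new]
    return total


def mutate(order, rng):
    """Propose a new order by inverting a randomly chosen segment."""
    i, j = sorted(rng.sample(range(len(order)), 2))
    return order[:i] + order[i:j + 1][::-1] + order[j + 1:]


def search(calls, start, generations=60, pop_size=10, seed=0):
    """Mutation-and-selection search; the best order so far is always retained."""
    rng = random.Random(seed)
    population = [list(start)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(pop_size)]
        ranked = sorted(population + children,
                        key=lambda o: log_likelihood(o, calls), reverse=True)
        population = ranked[:pop_size]
    return population[0]


def simulate(n_ind, n_markers, r, err, rng):
    """Simulate noisy genotype calls along the true order 0..n_markers-1."""
    calls = []
    for _ in range(n_ind):
        g = rng.randint(0, 1)
        row = []
        for m in range(n_markers):
            if m > 0 and rng.random() < r:
                g = 1 - g  # recombination between adjacent markers
            row.append(g if rng.random() >= err else 1 - g)
        calls.append(row)
    return calls


if __name__ == "__main__":
    rng = random.Random(1)
    calls = simulate(40, 8, 0.1, 0.05, rng)
    start = [0, 1, 5, 4, 3, 2, 6, 7]  # a misassembled (inverted) segment
    best = search(calls, start)
    print(best, log_likelihood(best, calls))
```

Because this toy model is symmetric, an order and its reverse have the same likelihood, so the search can only determine marker order up to orientation.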
