Construction and Analysis of High-Density Linkage Map Using High-Throughput Sequencing Data

Linkage maps enable the study of important biological questions. The construction of high-density linkage maps has become more feasible since the advent of next-generation sequencing (NGS), which eases SNP discovery and high-throughput genotyping of large populations. However, the explosion in marker number and the genotyping errors in NGS data challenge both the computational efficiency and the map quality of existing linkage analysis methods. Here we report HighMap, a method for constructing high-density linkage maps from NGS data. HighMap employs an iterative ordering and error correction strategy based on a k-nearest neighbor algorithm and a Monte Carlo multipoint maximum likelihood algorithm. A simulation study shows that HighMap can create a linkage map with three times as many markers as ordering-only methods while offering more accurate marker orders and more stable genetic distances. Using HighMap, we constructed a common carp linkage map with 10,004 markers. Its singleton rate was less than one-ninth of that generated by JoinMap4.1, and its total map distance was 5,908 cM, consistent with reports on low-density maps. HighMap is an efficient method for constructing high-density, high-quality linkage maps from high-throughput population NGS data. It will facilitate genome assembly, comparative genomic analysis, and QTL studies. HighMap is available at http://highmap.biomarker.com.cn/.
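
To make the error correction idea concrete, the following is a minimal sketch of how a k-nearest neighbor rule might flag and correct singletons (isolated genotype calls that disagree with their flanking markers) for one individual along an already ordered linkage group. It is not the HighMap implementation: the 0/1 genotype coding, the MISSING value, the function name knn_correct, and the parameters k and min_agree are all assumptions introduced for illustration, and the ordering step (the Monte Carlo multipoint maximum likelihood part) is not shown.

```python
from typing import List

MISSING = -1  # hypothetical code for a missing genotype call


def knn_correct(genos: List[int], k: int = 4, min_agree: float = 1.0) -> List[int]:
    """Smooth one individual's genotype calls along an ordered linkage group.

    genos: calls coded 0/1 (MISSING for no call), listed in current map order.
    k: number of nearest markers (by rank in the order) consulted per marker.
    min_agree: fraction of informative neighbours that must share one call
        before a disagreeing or missing call is overwritten.
    """
    n = len(genos)
    corrected = list(genos)
    for i in range(n):
        # The k markers closest to position i in the current order.
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: abs(j - i))[:k]
        # Votes come from the original calls, not from earlier corrections.
        calls = [genos[j] for j in neighbours if genos[j] != MISSING]
        if not calls:
            continue
        ones = sum(calls)
        if 2 * ones == len(calls):
            continue  # tie between the two calls: leave the marker alone
        majority = 1 if 2 * ones > len(calls) else 0
        support = max(ones, len(calls) - ones) / len(calls)
        if support >= min_agree and genos[i] != majority:
            corrected[i] = majority  # suspected error (or missing call) replaced
    return corrected


if __name__ == "__main__":
    # The lone 1 at index 3 is a singleton; the 0 -> 1 switch at index 7
    # looks like a genuine recombination and is left untouched.
    print(knn_correct([0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1]))
```

In an iterative scheme of the kind the abstract describes, a pass like this would alternate with re-ordering: corrected genotypes feed the next ordering round, and the improved order in turn defines better neighbours for the next correction round. A strict threshold such as min_agree = 1.0 keeps the sketch conservative, so that true recombination breakpoints are not smoothed away.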
