Genome Wide Sampling Sequencing for SNP Genotyping: Methods, Challenges and Future Development

Genetic polymorphisms, particularly single nucleotide polymorphisms (SNPs), have been widely used to advance quantitative, functional and evolutionary genomics. Ideally, whole genome sequencing or resequencing on next-generation sequencing (NGS) platforms would discover all genetic variants among individuals. To improve cost-effectiveness, however, the research community has mainly focused on developing genome-wide sampling sequencing (GWSS) methods, a collection that comprises reduced genome complexity sequencing, reduced genome representation sequencing and selective genome target sequencing. Here we review the major steps involved in library preparation, the types of adapters used for ligation and the primers designed to amplify ligated products for sequencing. Currently available GWSS methods have drawbacks: the number of reads per sample library, the number of sites/targets per individual, and the number of reads per site/target are all inconsistent, and each of these inconsistencies results in missing data. We propose suggestions to improve library construction, genotype calling accuracy, genome-wide marker density and read mapping rate. In brief, an optimized GWSS library preparation should generate a unique set of target sites that are densely distributed along chromosomes and evenly covered at each site across all individuals.
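The link between uneven coverage and missing data can be illustrated with a minimal sketch. It assumes a read-depth matrix (one row per individual, one column per site) and a hypothetical `min_depth` threshold below which a genotype is treated as uncalled; the function name, the threshold value and the example counts are illustrative assumptions, not taken from any of the reviewed methods.

```python
from statistics import mean, pstdev


def coverage_summary(depth, min_depth=5):
    """Summarize coverage for a depth matrix (rows: individuals, columns: sites).

    min_depth is an assumed calling threshold: a site with fewer reads
    in an individual is counted as a missing genotype.
    """
    n_ind = len(depth)
    n_sites = len(depth[0])

    # Reads per sample library (row sums).
    reads_per_sample = [sum(row) for row in depth]

    # Sites per individual that reach the depth threshold.
    sites_per_individual = [sum(1 for d in row if d >= min_depth) for row in depth]

    # Fraction of individual-by-site cells without a callable genotype.
    missing_rate = 1 - sum(sites_per_individual) / (n_ind * n_sites)

    # Coefficient of variation of reads per sample as a measure of unevenness.
    avg = mean(reads_per_sample)
    cv_reads = pstdev(reads_per_sample) / avg if avg else 0.0

    return {
        "reads_per_sample": reads_per_sample,
        "sites_per_individual": sites_per_individual,
        "missing_rate": missing_rate,
        "cv_reads_per_sample": cv_reads,
    }


# Invented counts for three individuals at four sites (illustration only).
example_depth = [
    [10, 0, 6, 8],
    [2, 7, 0, 9],
    [12, 5, 5, 1],
]
summary = coverage_summary(example_depth, min_depth=5)
print(summary)
```

In this toy matrix, four of the twelve individual-by-site cells fall below the threshold, so one third of the genotypes would be missing even though every library received a similar number of reads.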
