Gene-based SNP identification and validation in soybean using next-generation transcriptome sequencing

Gene-based molecular markers are increasingly used in crop breeding programs for marker-assisted selection. However, identifying genetic variants associated with important agronomic traits has remained difficult in soybean. Beyond assessing global expression variation of coding genes, RNA-Seq provides an efficient way to discover gene-based SNPs at the whole-genome level. In this study, RNA isolated from four soybean accessions, each with three replicates, was subjected to high-throughput sequencing, generating 44.2–65.9 million paired-end reads per library. After the replicates were combined, a total of 75,209 SNPs were identified among the different genotypes, 89.1% of which were located in expressed regions and 27.0% of which resulted in amino acid changes. GO enrichment analysis revealed that the most significantly enriched genes carrying nonsynonymous SNPs were involved in ribonucleotide binding or catalytic activity. All 22 SNPs subjected to PCR amplification and Sanger sequencing were validated. To test the utility of the identified SNPs, these validated SNPs were also used to genotype a relatively large population of 393 wild and cultivated soybean accessions. The SNPs identified by RNA-Seq provide a useful resource for genetic and genomic studies of soybean. Moreover, the collection of nonsynonymous SNPs, annotated with their predicted functional effects, is a valuable asset for further gene discovery, identification of gene variants, and development of functional markers.
