A Strategy for Finding the Optimal Scale of Plant Core Collection Based on Monte Carlo Simulation

Core collections are an ideal resource for genome-wide association studies (GWAS). A subcore collection is a subset of a core collection. A strategy based on Monte Carlo simulation was proposed for finding the optimal sampling percentage of a plant subcore collection. A cotton germplasm group of 168 accessions with 20 quantitative traits was used to construct subcore collections. A mixed linear model approach was used to eliminate the environment effect and the GE (genotype × environment) interaction effect. The least distance stepwise sampling (LDSS) method, combined with 6 commonly used genetic distances and the unweighted pair-group average (UPGMA) clustering method, was adopted to construct the subcore collections. A homogeneous population assessing method was adopted to assess the validity of 7 evaluating parameters for subcore collections. Monte Carlo simulation was conducted on the sampling percentage, the number of traits, and the evaluating parameters. A new method for “distilling free-form natural laws from experimental data” was adopted to find the best formula for determining the optimal sampling percentage. The results showed that the coincidence rate of range (CR) was the most valid evaluating parameter and was suitable as a threshold for finding the optimal sampling percentage. Principal component analysis showed that subcore collections constructed with the optimal sampling percentages calculated by the present strategy were well representative.
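As an illustration of how CR can be tracked against the sampling percentage in a Monte Carlo setting, the following minimal sketch may help. It assumes the commonly used definition of CR as the mean, over traits, of the ratio of the subset range to the full-collection range, expressed as a percentage. It is not the authors' procedure: subsets are drawn by simple random sampling rather than LDSS, the data are synthetic rather than the cotton genotypic values, and the function names `coincidence_rate_of_range` and `mean_cr_by_percentage` are hypothetical.

```python
import numpy as np


def coincidence_rate_of_range(full, subset):
    """CR (%) = mean over traits of range(subset) / range(full) * 100.

    Assumes every trait has a nonzero range in the full collection.
    """
    full_range = full.max(axis=0) - full.min(axis=0)
    sub_range = subset.max(axis=0) - subset.min(axis=0)
    return float(np.mean(sub_range / full_range) * 100.0)


def mean_cr_by_percentage(full, percentages, n_rep=200, seed=0):
    """Average CR over n_rep random subsets for each sampling percentage.

    Random sampling is a stand-in here (an assumption); the paper uses LDSS.
    """
    rng = np.random.default_rng(seed)
    n = full.shape[0]
    result = {}
    for p in percentages:
        k = max(2, int(round(n * p)))  # number of accessions to keep
        crs = []
        for _ in range(n_rep):
            idx = rng.choice(n, size=k, replace=False)
            crs.append(coincidence_rate_of_range(full, full[idx]))
        result[p] = float(np.mean(crs))
    return result


# Synthetic stand-in with the same shape as the data set in the abstract:
# 168 accessions x 20 quantitative traits.
rng = np.random.default_rng(42)
traits = rng.normal(size=(168, 20))

cr = mean_cr_by_percentage(traits, [0.1, 0.2, 0.3, 0.5])
for p, value in cr.items():
    print(f"sampling percentage {p:.0%}: mean CR = {value:.1f}%")
```

With a curve like this, an optimal sampling percentage could be read off as the smallest percentage whose CR reaches a chosen threshold; the threshold value and the fitted formula themselves would have to come from the paper.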
