Methods of constructing core collections by stepwise clustering with three sampling strategies based on the genotypic values of crops

Abstract: A genetic model with genotype×environment (GE) interactions that controls for systematic errors in the field can be used to predict genotypic values by an adjusted unbiased prediction (AUP) method. The Mahalanobis distance, calculated from the genotypic values, is then used to measure the genetic distance among accessions. Three hierarchical clustering methods (unweighted pair-group average, Ward’s, and complete linkage) are combined with three sampling strategies to construct core collections in a stepwise clustering procedure. A homogeneity test and t-tests are suggested for testing variances and means, respectively. The coincidence rate (CR%) for range and the variable rate (VR%) for the coefficient of variation are designed to evaluate the properties of core collections. A worked example of constructing core collections in cotton with 21 traits is presented. Random sampling can represent the genetic diversity structure of the initial collection. Preferred sampling can keep the accessions with special or valuable characteristics in the initial collection. Deviation sampling can retain the larger genetic variability of the initial collection. For better representation by the core collection, clustering methods should be combined with different sampling strategies. The core collections based on genotypic values retained larger genetic variability and were more representative than those based on phenotypic values.
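The distance, clustering, sampling, and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: a plain trait matrix stands in for the AUP-predicted genotypic values; a single round of cluster-then-sample replaces the full stepwise procedure; deviation sampling is assumed to keep, within each cluster, the accession with the largest summed absolute standardized deviation from the overall means; CR% and VR% are assumed to be the across-trait mean ratios (core over initial) of range and of coefficient of variation, times 100. Preferred sampling is not shown. The function names `mahalanobis_condensed`, `sample_core`, `coincidence_rate`, and `variable_rate` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def mahalanobis_condensed(values):
    """Pairwise Mahalanobis distances between rows (accessions), condensed form."""
    cov = np.cov(values, rowvar=False)
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    return pdist(values, metric="mahalanobis", VI=inv_cov)


def sample_core(values, fraction=0.3, method="average", strategy="random", seed=0):
    """One round of cluster-then-sample (a simplification of stepwise clustering).

    method:   SciPy linkage name; "average" is the unweighted pair-group average.
    strategy: "random" or "deviation" (the deviation rule is an assumption).
    Returns the sorted row indices of the sampled core collection.
    """
    rng = np.random.default_rng(seed)
    n_core = max(2, int(round(fraction * len(values))))
    tree = linkage(mahalanobis_condensed(values), method=method)
    # "maxclust" yields at most n_core clusters, so the core may be slightly smaller.
    labels = fcluster(tree, t=n_core, criterion="maxclust")
    z = (values - values.mean(axis=0)) / values.std(axis=0, ddof=1)
    deviation = np.abs(z).sum(axis=1)
    core = []
    for label in np.unique(labels):
        members = np.flatnonzero(labels == label)
        if strategy == "deviation":
            core.append(members[np.argmax(deviation[members])])
        else:
            core.append(rng.choice(members))
    return np.sort(np.array(core))


def coincidence_rate(core, initial):
    """Assumed CR%: mean over traits of core range / initial range, times 100."""
    range_core = core.max(axis=0) - core.min(axis=0)
    range_initial = initial.max(axis=0) - initial.min(axis=0)
    return 100.0 * np.mean(range_core / range_initial)


def variable_rate(core, initial):
    """Assumed VR%: mean over traits of core CV / initial CV, times 100."""
    def cv(x):
        return x.std(axis=0, ddof=1) / x.mean(axis=0)
    return 100.0 * np.mean(cv(core) / cv(initial))


if __name__ == "__main__":
    # Synthetic stand-in data: 60 accessions, 4 traits (not from the cotton example).
    data = np.random.default_rng(1).normal(50.0, 5.0, size=(60, 4))
    for strategy in ("random", "deviation"):
        idx = sample_core(data, fraction=0.3, strategy=strategy)
        print(strategy, len(idx),
              coincidence_rate(data[idx], data), variable_rate(data[idx], data))
```

Swapping `method` to `"complete"` or `"ward"` selects the other two linkage methods named above; note that SciPy's Ward linkage is only strictly defined for Euclidean distances, so using it on Mahalanobis distances is a further approximation.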