Analysis of genetic marker-phenotype relationships by jack-knifed partial least squares regression (PLSR).

The utility of a relatively new multivariate method, bi-linear modelling by cross-validated partial least squares regression (PLSR), was investigated in the analysis of QTL. The distinguishing feature of PLSR is that it reveals reliable covariance structures in data of different types measured on the same set of objects. Two matrices, X (here: genetic markers) and Y (here: phenotypes), are interactively decomposed into latent variables (PLS components, or PCs) in a way that facilitates statistically reliable and graphically interpretable model building. Natural collinearities between input variables are actively utilized to stabilise the modelling, instead of being treated as a statistical problem. The importance of cross-validation/jack-knifing as an intuitively appealing way to avoid overfitting is emphasized. Two data sets from chromosomal mapping studies of different complexity were chosen for illustration (QTL for tomato yield and for oat heading date). Results from the PLSR analysis of these data sets were compared to published results and to results obtained with the package PLABQTL. In all cases PLSR gave explained validation variances that were similar to or higher than those of the reported studies. An attractive feature is that PLSR allows several traits/replicates to be analysed in one analysis, and allows direct visual identification of individuals with desirable marker genotypes. It is suggested that PLSR may be useful in structural and functional genomics and in marker-assisted selection, particularly in cases with a limited number of objects.
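The workflow summarised above (decompose markers and a phenotype into a few PLS components, choose the number of components by cross-validation, then jack-knife the regression coefficients) can be sketched roughly as follows. This is a minimal illustration, not the procedure or software used in the study: it assumes a single trait (PLS1 fitted by NIPALS), leave-one-out cross-validation, a delete-one jack-knife variance estimate, and simulated data. The marker coding (-1/+1), the recombination rate of 0.2, the causal marker index and all function names are hypothetical. Plain Python is used so the sketch has no dependencies.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_pls1(X, y, n_comp):
    """PLS1 by NIPALS for one trait; returns (x_mean, y_mean, coefficients)."""
    n, k = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(k)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(k)] for row in X]  # centred markers
    f = [v - y_mean for v in y]                                 # centred phenotype
    comps = []
    for _ in range(n_comp):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(k)]  # X'y
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:  # nothing left to model
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                                  # scores
        tt = dot(t, t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(k)]
        q = dot(f, t) / tt
        # deflate X and y by the fitted component
        E = [[E[i][j] - t[i] * p[j] for j in range(k)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    # Coefficient j is the model's response to unit vector e_j,
    # obtained by deflating e_j component by component.
    b = []
    for j in range(k):
        e = [1.0 if i == j else 0.0 for i in range(k)]
        bj = 0.0
        for w, p, q in comps:
            s = dot(e, w)
            bj += q * s
            e = [e[i] - s * p[i] for i in range(k)]
        b.append(bj)
    return x_mean, y_mean, b

def predict(model, x):
    x_mean, y_mean, b = model
    return y_mean + dot([xi - m for xi, m in zip(x, x_mean)], b)

def loo_models(X, y, n_comp):
    """One sub-model per left-out individual."""
    return [fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
            for i in range(len(X))]

# Simulated data (assumption): 60 individuals, 8 linked markers, one causal marker.
random.seed(1)
n, k, causal = 60, 8, 3
X = []
for _ in range(n):
    g = [random.choice([-1.0, 1.0])]
    for _ in range(k - 1):
        g.append(g[-1] if random.random() > 0.2 else -g[-1])  # linkage
    X.append(g)
y = [2.0 * row[causal] + random.gauss(0.0, 0.5) for row in X]

# Explained validation variance for 1..4 components.
y_bar = sum(y) / n
ss_tot = sum((v - y_bar) ** 2 for v in y)
q2 = {}
for a in range(1, 5):
    subs = loo_models(X, y, a)
    press = sum((y[i] - predict(subs[i], X[i])) ** 2 for i in range(n))
    q2[a] = 1.0 - press / ss_tot
best = max(q2, key=q2.get)

# Jack-knife: spread of sub-model coefficients around the full-model coefficients.
b_full = fit_pls1(X, y, best)[2]
subs = loo_models(X, y, best)
se = [((n - 1) / n * sum((m[2][j] - b_full[j]) ** 2 for m in subs)) ** 0.5
      for j in range(k)]
t_jack = [b_full[j] / se[j] for j in range(k)]

print("components:", best, "explained validation variance:", round(q2[best], 2))
for j in range(k):
    print("marker", j, "b =", round(b_full[j], 2), "jack-knife t =", round(t_jack[j], 1))
```

Because neighbouring markers are correlated through linkage, the components pool their information rather than treating the collinearity as a problem, while the jack-knife t-values indicate which marker coefficients are stable across sub-models. Extending the sketch to several traits at once would require a PLS2 fit with a Y matrix, which is not shown here.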
