A Model Selection Approach for the Identification of Quantitative Trait Loci in Experimental Crosses, Allowing Epistasis

The identification of quantitative trait loci (QTL) and their interactions is a crucial step toward the discovery of genes responsible for variation in experimental crosses. The problem is best viewed as one of model selection, and the most important aspect of the problem is the comparison of models of different sizes. We present a penalized likelihood approach, with penalties on QTL and pairwise interactions chosen to control false positive rates. This extends the work of Broman and Speed to allow for pairwise interactions among QTL. A conservative version of our penalized LOD score provides strict control over the rate of extraneous QTL and interactions; a more liberal criterion is more lenient on interactions but seeks to maintain control over the rate of inclusion of false loci. The key advance is that one needs only to specify a target false positive rate rather than a prior on the number of QTL and interactions. We illustrate the use of our model selection criteria as exploratory tools; simulation studies demonstrate reasonable power to detect QTL. Our liberal criterion is comparable in power to two Bayesian approaches.
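As a rough illustration of the idea, a penalized LOD score can be sketched as the model's LOD score minus a fixed penalty for each QTL and each pairwise interaction it includes. The additive form, the function names, and the numeric penalties and LOD values below are assumptions made for illustration only. They are not taken from the paper, where the penalties are chosen to meet a target false positive rate.

```python
# Minimal sketch of model comparison with a penalized LOD score.
# Assumption: the penalty is additive, one term per QTL and one per
# pairwise interaction. All numbers are hypothetical.

def penalized_lod(lod, n_qtl, n_interactions, penalty_main, penalty_int):
    """Return the model LOD minus penalties for its QTL and interactions."""
    if n_qtl < 0 or n_interactions < 0:
        raise ValueError("counts must be non-negative")
    return lod - penalty_main * n_qtl - penalty_int * n_interactions


def best_model(candidates, penalty_main, penalty_int):
    """Pick the candidate model with the largest penalized LOD."""
    return max(
        candidates,
        key=lambda m: penalized_lod(
            m["lod"], m["n_qtl"], m["n_int"], penalty_main, penalty_int
        ),
    )


# Hypothetical candidate models of increasing size.
candidates = [
    {"name": "null", "lod": 0.0, "n_qtl": 0, "n_int": 0},
    {"name": "one QTL", "lod": 4.5, "n_qtl": 1, "n_int": 0},
    {"name": "two QTL, additive", "lod": 8.2, "n_qtl": 2, "n_int": 0},
    {"name": "two QTL + interaction", "lod": 10.1, "n_qtl": 2, "n_int": 1},
]

# Illustrative penalties only; in practice they would be derived from
# the chosen false positive rate.
chosen = best_model(candidates, penalty_main=3.0, penalty_int=4.0)
print(chosen["name"])
```

With these made-up numbers the interaction raises the LOD score by less than its penalty, so the additive two-QTL model is preferred. This shows how the penalties make models of different sizes comparable on one scale.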
