An Efficient and Scalable Implementation of SNP-Pair Interaction Testing for Genetic Association Studies

This paper describes a scalable approach to one of the most computationally intensive problems in molecular plant breeding: associating quantitative traits with genetic markers. The fundamental problem is to establish statistical associations between particular loci in the genome of an individual plant and the expressed characteristics of that individual. Although applied here to plants, the problem generalizes to mapping genotypes to phenotypes throughout biology. This work presents a formulation of a statistical approach for identifying pairwise interactions between markers, then describes the implementation, optimization, and parallelization of that approach, together with scalability results.
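The abstract does not spell out the statistical model, so the following is only a minimal sketch of what a pairwise interaction scan can look like. It assumes an ordinary least-squares regression with additive genotype coding (0/1/2) and an F statistic for the interaction term. The function names `interaction_f_stat` and `scan_pairs` are hypothetical and are not taken from the paper.

```python
import itertools
import numpy as np


def _rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def interaction_f_stat(y, g1, g2):
    """F statistic for the g1*g2 term (assumed model, not the paper's).

    Compares the reduced model y ~ 1 + g1 + g2 against the
    full model y ~ 1 + g1 + g2 + g1*g2 (one extra parameter).
    """
    n = len(y)
    ones = np.ones(n)
    X_reduced = np.column_stack([ones, g1, g2])
    X_full = np.column_stack([ones, g1, g2, g1 * g2])
    rss_reduced = _rss(X_reduced, y)
    rss_full = _rss(X_full, y)
    df_resid = n - X_full.shape[1]
    return (rss_reduced - rss_full) / (rss_full / df_resid)


def scan_pairs(y, genotypes):
    """Test every SNP pair and return (i, j, F) sorted by decreasing F.

    genotypes: array of shape (n_individuals, n_snps), coded 0/1/2.
    """
    n_snps = genotypes.shape[1]
    results = [
        (i, j, interaction_f_stat(y, genotypes[:, i], genotypes[:, j]))
        for i, j in itertools.combinations(range(n_snps), 2)
    ]
    return sorted(results, key=lambda r: r[2], reverse=True)
```

For m markers the scan evaluates m(m-1)/2 pairs, which is why the problem becomes computationally intensive at genome scale. Each pair is tested independently of the others, so the loop in `scan_pairs` can in principle be split across workers.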
