simplePHENOTYPES: SIMulation of pleiotropic, linked and epistatic phenotypes

Motivation: Advances in genotyping and phenotyping techniques have made it possible to collect large amounts of data. Consequently, there is growing interest in multivariate statistical analyses that identify genomic regions likely to contain causal mutations affecting multiple traits (i.e., pleiotropy). As the demand for multivariate analyses increases, tools are needed to compare different implementations of these analyses. To facilitate the testing and validation of these multivariate approaches, we developed simplePHENOTYPES, an R package that simulates pleiotropy, partial pleiotropy, and spurious pleiotropy under a wide range of genetic architectures, including additive, dominance, and epistatic models.

Results: We illustrate simplePHENOTYPES’ ability to simulate thousands of phenotypes in less than one minute. We then provide a vignette illustrating how to simulate a set of correlated traits in simplePHENOTYPES. Finally, we demonstrate the use of simplePHENOTYPES output in standard GWAS software, and show that phenotypes simulated by simplePHENOTYPES are numerically equivalent to those from other packages with similar capabilities.

Conclusions: simplePHENOTYPES is a CRAN package that makes it possible to simulate multiple traits controlled by loci with varying degrees of pleiotropy. Its ability to interface with both commonly used marker data formats and downstream quantitative genetics software and packages should facilitate a rigorous assessment of both existing and emerging statistical approaches to genome-wide association studies (GWAS) and genomic selection (GS). simplePHENOTYPES is also available at https://github.com/samuelbfernandes/simplePHENOTYPES.
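To make the idea of simulating pleiotropic traits concrete, here is a minimal Python sketch of one common approach: several traits share the same causal markers (QTNs) with trait-specific additive effects, and a normal residual is scaled so that each trait reaches a target heritability. This is not the simplePHENOTYPES API or necessarily its algorithm; the function name `simulate_pleiotropic_traits`, the 0/1/2 genotype coding, the marker indices, and the effect sizes are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_pleiotropic_traits(genotypes, qtn_idx, effects, h2, seed=0):
    """Simulate additive traits that share the same causal markers.

    Hypothetical helper for illustration only (not part of simplePHENOTYPES).
    genotypes: one list of marker codes (0/1/2) per individual
    qtn_idx:   marker indices acting as QTNs for every trait (pleiotropy)
    effects:   effects[t][k] is the additive effect of QTN k on trait t
    h2:        target heritability for each trait, in (0, 1]
    """
    rng = random.Random(seed)
    traits = []
    for trait_effects, trait_h2 in zip(effects, h2):
        # Genetic value: sum of QTN genotype codes weighted by their effects.
        genetic = [
            sum(a * ind[j] for a, j in zip(trait_effects, qtn_idx))
            for ind in genotypes
        ]
        # Choose the residual variance so that var_g / (var_g + var_e) = h2.
        var_g = statistics.pvariance(genetic)
        var_e = var_g * (1.0 - trait_h2) / trait_h2
        sd_e = var_e ** 0.5
        traits.append([g + rng.gauss(0.0, sd_e) for g in genetic])
    return traits


# Toy data: 500 individuals, 50 markers coded 0/1/2 (assumed values).
data_rng = random.Random(42)
n_ind, n_markers = 500, 50
genotypes = [
    [data_rng.choice([0, 1, 2]) for _ in range(n_markers)]
    for _ in range(n_ind)
]

qtn_idx = [3, 17, 41]              # the same QTNs affect both traits
effects = [[0.5, 0.3, 0.2],        # trait 1 effects
           [0.4, -0.2, 0.1]]       # trait 2 effects
h2 = [0.7, 0.4]                    # target heritabilities

traits = simulate_pleiotropic_traits(genotypes, qtn_idx, effects, h2)
```

Because both traits draw on the same three markers, their genetic values are correlated. Giving each trait a different, possibly overlapping, set of QTN indices would be one way to sketch partial pleiotropy in the same framework.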
