qgg: an R package for large-scale quantitative genetic analyses

Summary: Studies of complex traits and diseases depend heavily on user-friendly software designed to handle large-scale genetic and phenotypic data. Here, we present the R package qgg, which provides an environment for large-scale genetic analyses of quantitative traits and disease phenotypes. The qgg package provides an infrastructure for efficient processing of large-scale genetic data, together with functions for estimating genetic parameters, performing single- and multiple-marker association analyses, and making genomic-based predictions of phenotypes. In particular, we have developed novel predictive models that use information on functional features of the genome to enable more accurate predictions of complex trait phenotypes. We illustrate core facilities of the qgg package by analysing human standing height from the UK Biobank.

Availability and implementation: The R package qgg is freely available. For the latest updates, user guides and example scripts, consult the main page http://psoerensen.github.io/qgg/.