Including Phenotypic Causal Networks in Genome-Wide Association Studies Using Mixed Effects Structural Equation Models

Network-based statistical models that account for putative causal relationships among multiple phenotypes can be used to infer single-nucleotide polymorphism (SNP) effects that are transmitted through a given causal path in genome-wide association studies (GWAS). In GWAS with multiple phenotypes, reconstructing the underlying causal structures among traits and SNPs within a single statistical framework is essential for understanding the entirety of genotype-phenotype maps. A structural equation model (SEM) can be used for this purpose. We applied SEM to GWAS (SEM-GWAS) in chickens, taking into account putative causal relationships among breast meat (BM), body weight (BW), hen-house production (HHP), and SNPs. We assessed the performance of SEM-GWAS by comparing its results with those obtained from traditional multi-trait association analyses (MTM-GWAS). Three different putative causal path diagrams were inferred from highest posterior density (HPD) intervals of 0.75, 0.85, and 0.95 using the inductive causation algorithm. A positive path coefficient was estimated for BM → BW, and negative path coefficients were estimated for BM → HHP and BW → HHP in all implemented scenarios. Further, the application of SEM-GWAS enabled the decomposition of SNP effects into direct, indirect, and total effects, identifying whether a SNP acts directly or indirectly on a given trait. In contrast, MTM-GWAS captured only the overall genetic effects on traits, which is equivalent to combining the direct and indirect SNP effects from SEM-GWAS. Although MTM-GWAS and SEM-GWAS use similar probabilistic models, we provide evidence that SEM-GWAS captures complex relationships in terms of causal meaning and mediation and delivers a more comprehensive understanding of SNP effects than MTM-GWAS. Our results showed that SEM-GWAS provides important insight into the mechanism by which identified SNPs control traits by partitioning their effects into direct, indirect, and total components.
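The decomposition of a SNP effect along the BM → BW, BM → HHP, and BW → HHP paths can be illustrated with a small numerical sketch. It assumes a recursive (acyclic) linear SEM in which total effects equal (I − Λ)⁻¹ times the direct effects, where Λ holds the path coefficients. The function name, the trait ordering, and all numeric values below are hypothetical and chosen only to match the signs reported above; they are not estimates from the study.

```python
import numpy as np

# Trait order assumed for this sketch: BM, BW, HHP.
TRAITS = ["BM", "BW", "HHP"]


def decompose_snp_effects(path_coefs, direct_effects):
    """Split SNP effects into direct, indirect, and total parts.

    path_coefs[i, j] is the path coefficient for trait j -> trait i
    (strictly lower triangular for a recursive, acyclic structure).
    direct_effects[i] is the direct SNP effect on trait i.
    """
    lam = np.asarray(path_coefs, dtype=float)
    direct = np.asarray(direct_effects, dtype=float)
    identity = np.eye(lam.shape[0])
    # Total effect: solve (I - Lambda) * total = direct.
    total = np.linalg.solve(identity - lam, direct)
    # Indirect effect: whatever reaches a trait through upstream traits.
    indirect = total - direct
    return direct, indirect, total


# Hypothetical path coefficients (not values from the study):
# BM -> BW positive, BM -> HHP and BW -> HHP negative.
LAMBDA = np.array([
    [0.0, 0.0, 0.0],    # BM has no parent trait
    [0.5, 0.0, 0.0],    # BM -> BW
    [-0.2, -0.3, 0.0],  # BM -> HHP, BW -> HHP
])

# Hypothetical direct SNP effects on BM, BW, HHP.
DIRECT = np.array([1.0, 0.4, 0.1])

direct, indirect, total = decompose_snp_effects(LAMBDA, DIRECT)

if __name__ == "__main__":
    for name, d, i, t in zip(TRAITS, direct, indirect, total):
        print(f"{name}: direct={d:+.2f} indirect={i:+.2f} total={t:+.2f}")
```

With these illustrative numbers, the total effect on HHP is the sum of the direct effect and three mediated paths (SNP → BM → HHP, SNP → BW → HHP, and SNP → BM → BW → HHP), which is the quantity a multi-trait model without causal structure would report as a single overall effect.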
