eQTLMAPT: Fast and Accurate eQTL Mediation Analysis With Efficient Permutation Testing Approaches

Expression quantitative trait locus (eQTL) analyses are critical for understanding the complex regulatory functions of genetic variation and have been widely used to interpret disease-associated variants identified by genome-wide association studies (GWAS). Emerging evidence has shown that trans-eQTL effects on remote gene expression can be mediated by local transcripts, which is known as a mediation effect. To discover genome-wide eQTL mediation effects by combining genomic and transcriptomic profiles, it is necessary to develop novel computational methods that rapidly scan a large number of candidate associations while controlling appropriately for multiple testing. Here, we present eQTLMAPT, an R package that performs eQTL mediation analysis and implements efficient permutation procedures for multiple testing correction. eQTLMAPT offers three advantages. First, it accelerates mediation analysis by effectively pruning the permutation process through an adaptive permutation scheme. Second, it can efficiently and accurately estimate the significance level of mediation effects by modeling the null distribution with a generalized Pareto distribution (GPD) fitted to a small number of permutation statistics. Third, eQTLMAPT provides flexible interfaces for users to combine various permutation schemes with different confounding adjustment methods. Experiments on a real eQTL dataset demonstrate that eQTLMAPT provides higher resolution in the estimated significance of mediation effects and is an order of magnitude faster than the compared methods at similar accuracy.
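
The two permutation ideas named above can be illustrated with a minimal Python sketch. It is not eQTLMAPT's code or API: the function names, the `min_exceed` and `n_tail` parameters, and the use of a simple method-of-moments GPD fit are all assumptions made for illustration, and the package itself may fit the GPD differently. The first function stops permuting once enough null statistics reach the observed one; the second extrapolates a small p-value from the largest permutation statistics.

```python
import math
import random
import statistics

def adaptive_permutation_p(observed, draw_null, max_perm=10000, min_exceed=20):
    """Empirical p-value with early stopping (hypothetical helper).

    draw_null() returns one permutation statistic. Permutation stops as soon
    as `min_exceed` null statistics reach `observed`, because a clearly
    non-significant test does not need the full permutation budget.
    Returns (p_value, permutations_used).
    """
    exceed = 0
    for n in range(1, max_perm + 1):
        if draw_null() >= observed:
            exceed += 1
            if exceed >= min_exceed:
                return exceed / n, n
    return (exceed + 1) / (max_perm + 1), max_perm

def gpd_tail_p(observed, null_stats, n_tail=250):
    """Tail p-value from a GPD fitted to the largest null statistics.

    Assumption: method-of-moments estimates of the GPD shape and scale,
    chosen here only because they need no external library.
    """
    stats = sorted(null_stats, reverse=True)
    if len(stats) <= n_tail:
        raise ValueError("need more permutation statistics than n_tail")
    # Threshold halfway between the last tail value and the next one down.
    threshold = (stats[n_tail - 1] + stats[n_tail]) / 2
    y = observed - threshold
    if y <= 0:
        # Observed value is not in the tail: use the plain empirical estimate.
        count = sum(s >= observed for s in stats)
        return (count + 1) / (len(stats) + 1)
    exceedances = [s - threshold for s in stats[:n_tail]]
    m = statistics.fmean(exceedances)
    v = statistics.variance(exceedances)
    shape = 0.5 * (1 - m * m / v)
    scale = 0.5 * m * (m * m / v + 1)
    if abs(shape) < 1e-9:
        survival = math.exp(-y / scale)
    else:
        base = 1 + shape * y / scale
        survival = base ** (-1 / shape) if base > 0 else 0.0
    # Scale the GPD survival probability by the fraction of statistics in the tail.
    return (n_tail / len(stats)) * survival

# Toy usage with an exponential null distribution (illustrative data only).
rng = random.Random(1)
print(adaptive_permutation_p(0.5, lambda: rng.expovariate(1.0)))
print(gpd_tail_p(9.0, [rng.expovariate(1.0) for _ in range(5000)]))
```

In this toy setting the first call returns after a few dozen permutations, while the second yields an estimate that can fall below the 1/(N+1) floor of the plain empirical p-value, which is what allows higher resolution from a fixed number of permutations.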
