Pathway crosstalk effects: shrinkage and disentanglement using a Bayesian hierarchical model

Identifying the biological pathways that are related to various clinical phenotypes is an important concern in biomedical research. Based on estimated expression levels and/or p values, overrepresentation analysis (ORA) methods provide rankings of pathways, but these rankings are distorted because pathways overlap. This crosstalk phenomenon has not been rigorously studied, and classical ORA does not take into account that: (1) crosstalk effects between overlapping pathways can cause incorrect rankings of pathways, (2) crosstalk effects can cause excess type I and type II errors, (3) rankings of small pathways are unreliable, and (4) type I error rates can be inflated by multiple comparisons of pathways. We develop a Bayesian hierarchical model that addresses these problems, providing sensible estimates and rankings and reducing error rates. We show, on both real and simulated data, that the results of our method are more accurate than those produced by classical overrepresentation analysis, providing a better understanding of the underlying biological phenomena involved in the phenotypes under study. The R code and the binary datasets for implementing the analyses described in this article are available online at: http://www.eng.wayne.edu/page.php?id=6402.
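
To make the crosstalk problem concrete, the following is a minimal Python sketch of classical ORA on a toy example. It is not the Bayesian hierarchical model described above and is not taken from the authors' R code. It assumes the common formulation of ORA as an upper-tail hypergeometric test with a Benjamini-Hochberg adjustment, and the gene universe, the two pathways, and all function names are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_de, pathway_size, n_de_in_pathway):
    """Upper-tail hypergeometric p value, P(X >= n_de_in_pathway).

    Assumes classical ORA is run as a hypergeometric test (an assumption
    made for this sketch).
    """
    upper = min(pathway_size, n_de)
    total = comb(n_universe, n_de)
    tail = sum(
        comb(pathway_size, k) * comb(n_universe - pathway_size, n_de - k)
        for k in range(n_de_in_pathway, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy data: 100 genes, 10 of them differentially expressed (DE).
universe = set(range(100))
de_genes = set(range(10))

# Pathway A contains every DE gene. Pathway B shares genes 0..7 with A
# and has 12 genes of its own (20..31), none of which is DE.
pathway_a = set(range(20))
pathway_b = set(range(8)) | set(range(20, 32))


def ora(pathway):
    """Run the toy ORA test for one pathway against the toy universe."""
    return ora_p_value(
        len(universe), len(de_genes), len(pathway), len(pathway & de_genes)
    )


p_a = ora(pathway_a)
p_b = ora(pathway_b)
# Re-test B using only the genes it does not share with A.
p_b_exclusive = ora(pathway_b - pathway_a)

adj_a, adj_b = benjamini_hochberg([p_a, p_b])
print(f"A: p={p_a:.3g}  B: p={p_b:.3g}  B without shared genes: p={p_b_exclusive:.3g}")
print(f"BH-adjusted: A={adj_a:.3g}  B={adj_b:.3g}")
```

In this toy setup pathway B looks significant only because of the genes it shares with pathway A; once the shared genes are set aside, B carries no evidence at all. This is the kind of distortion that a ranking based on independent per-pathway tests cannot detect.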
