Integrative Genomics with Mediation Analysis in a Survival Context

DNA copy number aberrations (DCNA) and the resulting altered gene expression profiles may have a major impact on tumor initiation and development, and eventually on recurrence and cancer-specific mortality. However, most methods used for integrative genomic analysis of the two biological levels, DNA and RNA, do not consider survival time. In the present note, we propose a survival analysis-based framework for the integrative analysis of DCNA and mRNA levels that reveals their implications for patient clinical outcome, under the assumption that the effect of DCNA on survival is mediated by mRNA levels. The specific aim of the paper is to offer a feasible framework for testing the DCNA-mRNA-survival pathway. We provide statistical inference algorithms for mediation based on asymptotic results. Furthermore, we illustrate the applicability of the method in an integrative genomic analysis setting using a breast cancer data set consisting of 141 invasive breast tumors. In addition, we provide an implementation in R.
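As a rough illustration of what asymptotic inference for a mediated effect can look like, the sketch below applies the first-order delta method (the Sobel approach) to a product of two coefficients. It is not taken from the paper and is not the authors' R implementation. The function name `sobel_test` and its arguments are hypothetical: `alpha` stands for an estimated effect of DCNA on mRNA level, `beta` for an estimated effect of mRNA level on the hazard, and the two estimates are assumed to be uncorrelated and approximately normal.

```python
import math


def sobel_test(alpha, se_alpha, beta, se_beta, z_crit=1.959964):
    """Delta-method (Sobel) inference for the product alpha * beta.

    Assumptions (illustrative, not from the paper): the two coefficient
    estimates are uncorrelated and asymptotically normal.
    Returns (indirect effect, standard error, z statistic,
    two-sided p-value, approximate 95% confidence interval).
    """
    indirect = alpha * beta
    # First-order delta-method variance of a product of independent estimates
    se = math.sqrt(alpha ** 2 * se_beta ** 2 + beta ** 2 * se_alpha ** 2)
    if se == 0.0:
        raise ValueError("standard error is zero; the test is undefined")
    z = indirect / se
    # Two-sided p-value under a standard normal reference distribution
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci = (indirect - z_crit * se, indirect + z_crit * se)
    return indirect, se, z, p_value, ci


# Hypothetical inputs, chosen only to exercise the function
indirect, se, z, p_value, ci = sobel_test(0.5, 0.1, 0.4, 0.1)
print(indirect, se, z, p_value, ci)
```

The normal approximation to a product of estimates can be poor in small samples, so an interval obtained this way is best read as a first check rather than a final result.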
