MODELING DEPENDENT GENE EXPRESSION

In this paper we propose a Bayesian approach for inference about dependence in high-throughput gene expression data. Our goals are to use prior knowledge about pathways to anchor inference about dependence among genes; to account for this dependence while making inferences about differences in mean expression across phenotypes; and to explore differences in the dependence itself across phenotypes. Useful features of the proposed approach are a model-based, parsimonious representation of expression as an ordinal outcome; a novel and flexible representation of prior information on the nature of dependencies; and a coherent probability model over both the structure and the strength of the dependencies of interest. We evaluate our approach through simulations and through an analysis of expression data for genes in the Complement and Coagulation Cascade pathway in ovarian cancer.
