Modular network construction using eQTL data: an analysis of computational costs and benefits

Background: In this paper, we consider analytic methods for the integrated analysis of genomic DNA variation and mRNA expression (also known as eQTL data), to discover genetic networks that are associated with a complex trait of interest. Our focus is the systematic evaluation of the trade-off between network size and network search efficiency in the construction of these networks. Results: We developed a modular approach to network construction, building from smaller networks to larger ones, thereby reducing the search space while including more variables in the analysis. The goal is to achieve a lower computational cost while maintaining high confidence in the resulting networks. As demonstrated in our simulation results, networks built in this way have a low node/edge false discovery rate (FDR) and high edge sensitivity compared with greedy search. We further demonstrate our method on a data set of cellular responses to two chemotherapeutic agents, docetaxel and 5-fluorouracil (5-FU), and identify biologically plausible networks that might describe resistance to these drugs. Conclusion: In this study, we suggest that guided comprehensive searches for parsimonious networks should be considered as an alternative to greedy network searches.
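
To make the two edge-level evaluation measures named in the abstract concrete, the following is a minimal sketch of how edge FDR and edge sensitivity could be computed when a simulated true network is available. It is not the authors' code: the function name `edge_metrics`, the representation of edges as directed `(parent, child)` tuples, the exact-match criterion, and the toy node names are all assumptions made for illustration, and the paper's own definitions may differ in detail.

```python
def edge_metrics(true_edges, inferred_edges):
    """Return (edge FDR, edge sensitivity) for two directed edge lists.

    Assumption: an inferred edge counts as correct only if the same
    (parent, child) pair appears in the true network.
    """
    true_set = set(true_edges)
    inferred_set = set(inferred_edges)
    true_positives = len(true_set & inferred_set)
    false_positives = len(inferred_set) - true_positives

    # FDR: fraction of inferred edges that are not in the true network.
    fdr = false_positives / len(inferred_set) if inferred_set else 0.0
    # Sensitivity: fraction of true edges that were recovered.
    sensitivity = true_positives / len(true_set) if true_set else 0.0
    return fdr, sensitivity


# Hypothetical toy networks (node names are illustrative only).
true_net = [("snp1", "geneA"), ("geneA", "geneB"), ("geneB", "trait")]
inferred_net = [
    ("snp1", "geneA"),
    ("geneA", "geneB"),
    ("geneA", "trait"),
    ("geneC", "trait"),
]

fdr, sensitivity = edge_metrics(true_net, inferred_net)
print(f"edge FDR = {fdr:.2f}, edge sensitivity = {sensitivity:.2f}")
```

In this toy case two of the four inferred edges are correct and two of the three true edges are recovered; a node-level FDR could be computed the same way on sets of nodes instead of edges.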
