Inferring combinatorial association logic networks in multimodal genome-wide screens

Motivation: We propose an efficient method to infer combinatorial association logic networks from multiple genome-wide measurements taken on the same samples. We demonstrate the method on a genetical genomics dataset, in which we search for Boolean combinations of multiple genetic loci that associate with transcript levels.

Results: Our method provably finds the global solution and runs up to four orders of magnitude faster than exhaustive search. This speed makes permutation procedures practical for determining accurate false positive rates, and it allows selection of the most parsimonious model. When applied to transcript levels measured in myeloid cells from 24 genotyped recombinant inbred mouse strains, the method revealed nine gene clusters that are putatively modulated by a logical combination of trait loci rather than by a single locus. A literature survey supports and further elucidates one of these findings. Our approach makes it feasible to obtain optimal solutions for multi-locus logic models together with accurate estimates of the associated false discovery rates. The algorithm therefore offers a valuable alternative to approaches that rely on complex but suboptimal optimization strategies to identify complex models.

Availability: The MATLAB code of the prototype implementation is available at http://bioinformatics.tudelft.nl/ or http://bioinformatics.nki.nl/

Contact: m.j.t.reinders@tudelft.nl; l.wessels@nki.nl
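To make the search problem concrete, the following is a minimal Python sketch of the exhaustive baseline that the abstract compares against, restricted to two-locus models, together with a permutation p-value for the best model. It is not the authors' efficient algorithm or their MATLAB code. The function names, the choice of AND/OR/XOR as the logic set, the 0/1 genotype coding, and the Welch t-statistic as the association score are all assumptions made for illustration.

```python
import itertools
import random
import statistics

# Assumed (hypothetical) set of two-input logic functions to search over.
LOGIC = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def score(split, expression):
    """Absolute Welch t-statistic between the groups defined by a 0/1 vector."""
    g1 = [y for s, y in zip(split, expression) if s]
    g0 = [y for s, y in zip(split, expression) if not s]
    if len(g1) < 2 or len(g0) < 2:
        return 0.0  # too few samples in one group to estimate a variance
    diff = abs(statistics.mean(g1) - statistics.mean(g0))
    se = (statistics.variance(g1) / len(g1)
          + statistics.variance(g0) / len(g0)) ** 0.5
    if se == 0.0:
        return float("inf") if diff > 0 else 0.0
    return diff / se


def best_pair_model(genotypes, expression):
    """Exhaustively score every locus pair under every logic function.

    genotypes: list of loci, each a list of 0/1 alleles (one per strain).
    Returns (best_score, (locus_i, logic_name, locus_j)).
    """
    best = (0.0, None)
    for i, j in itertools.combinations(range(len(genotypes)), 2):
        for name, fn in LOGIC.items():
            split = [fn(a, b) for a, b in zip(genotypes[i], genotypes[j])]
            s = score(split, expression)
            if s > best[0]:
                best = (s, (i, name, j))
    return best


def permutation_p_value(genotypes, expression, n_perm=200, seed=0):
    """Compare the best observed score with best scores on shuffled expression."""
    rng = random.Random(seed)
    observed, _ = best_pair_model(genotypes, expression)
    shuffled = list(expression)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_best, _ = best_pair_model(genotypes, shuffled)
        if null_best >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Calling `best_pair_model(genotypes, expression)` returns the top-scoring pair and logic function, and `permutation_p_value` repeats the whole search on every shuffle. The cost of that repetition grows with the number of locus pairs and permutations, which illustrates why a faster search matters for permutation-based error rates.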
