Hybrid Method Inference for the Construction of Cooperative Regulatory Network in Human

Reconstruction of large-scale gene regulatory networks (GRNs) is an important step toward understanding the complex regulatory mechanisms within the cell. Many modeling approaches have been introduced to find causal relationships between genes using expression data. However, they suffer from high dimensionality (a large number of genes but a small number of samples), overfitting, heavy computation time, and low interpretability. We previously proposed an original data mining algorithm, Licorn, which infers cooperative regulation networks from expression datasets. In this work, we present h-Licorn, an extension of Licorn to a hybrid inference method that searches in both discrete and real-valued spaces. Licorn's algorithm, which uses the discrete space to find cooperative regulation relationships fitting the target gene expression, has been shown to be powerful in identifying cooperative regulation relationships that are out of the scope of most GRN inference methods. Still, like many related GRN inference techniques, Licorn suffers from a large number of false positives. We therefore extend Licorn with a numerical selection step, expressed as a linear regression problem, that effectively complements its discrete search. We evaluate a bootstrapped version of h-Licorn on the in silico Dream5 dataset and show that h-Licorn has significantly higher performance than Licorn and is competitive with, or outperforms, state-of-the-art GRN inference algorithms, especially when operating on small datasets. We also applied h-Licorn to a real dataset of human bladder cancer and show that it performs better than other methods in finding candidate regulatory interactions. In particular, based solely on gene expression data, h-Licorn is able to identify experimentally validated cooperative relationships between regulators involved in cancer.
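The abstract describes a two-stage idea: a discrete search proposes candidate sets of co-regulators, and a numerical step framed as linear regression selects among them, with bootstrapping for robustness. The sketch below illustrates only that general pattern and is not the h-Licorn implementation. The candidate sets are assumed to be given (standing in for the output of the discrete search), and the function names, the ordinary least-squares criterion, the toy data, and the equal-size candidates are all assumptions made for illustration.

```python
import numpy as np


def fit_error(X, y):
    """Mean squared residual of an ordinary least-squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(np.mean(resid ** 2))


def select_candidate(expr, target, candidates):
    """Index of the candidate regulator set whose regression fits the target best."""
    errors = [fit_error(expr[:, list(c)], target) for c in candidates]
    return int(np.argmin(errors))


def bootstrap_selection(expr, target, candidates, n_boot=50, seed=0):
    """Fraction of bootstrap resamples in which each candidate set is selected."""
    rng = np.random.default_rng(seed)
    n = expr.shape[0]
    counts = np.zeros(len(candidates))
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample samples with replacement
        counts[select_candidate(expr[idx], target[idx], candidates)] += 1
    return counts / n_boot


# Toy data (hypothetical): 40 samples, 4 regulators; the target depends on regulators 0 and 1.
rng = np.random.default_rng(1)
expr = rng.normal(size=(40, 4))
target = 2.0 * expr[:, 0] - 1.5 * expr[:, 1] + 0.1 * rng.normal(size=40)

# Candidate co-regulator sets, assumed to come from a prior discrete search.
candidates = [(0, 1), (2, 3), (0, 3)]
freq = bootstrap_selection(expr, target, candidates)
print(freq)
```

Keeping the candidates the same size sidesteps the fact that in-sample error favors larger regulator sets; a real selection step would need some penalty or held-out evaluation to compare sets of different sizes.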
