Gene Networks Inference through Linear Grouping of Variables

Inferring gene networks from gene expression data remains an open problem because of the high dimensionality (number of genes) and the small number of samples typically available, even though such networks are sparse (each target gene has a limited number of input genes). In this work we propose a method that alleviates the curse of dimensionality by grouping predictor gene configurations according to the value of their linear combination. Each linear combination value defines an equivalence class. As a result, the number of configurations of predictor values grows linearly with the dimensionality (number of predictors) rather than exponentially, as it does for the original configurations. The proposed method follows the probabilistic gene networks approach, which applies local feature selection to obtain an adequate predictor gene set for each gene. Although some information from the original predictor configurations is lost in the grouping, the results indicate that inference with linear grouping tends to yield networks with better topological similarity than those obtained without grouping when the number of samples is quite limited and the inference involves a larger number of predictors per gene.
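
The reduction from exponential to linear growth can be illustrated with a minimal sketch. It assumes binary expression values and unit coefficients (so the linear combination is a plain sum); the abstract does not specify the coefficients or the quantization, and the function name `group_by_linear_combination` is hypothetical, not taken from the paper.

```python
from collections import defaultdict
from itertools import product


def group_by_linear_combination(n_predictors, weights=None, levels=(0, 1)):
    """Group every predictor configuration by its linear combination value.

    Assumptions (not stated in the abstract): expression levels are
    binary by default and all coefficients default to 1.
    """
    if weights is None:
        weights = [1] * n_predictors
    classes = defaultdict(list)
    # Enumerate all original configurations (exponential in n_predictors).
    for config in product(levels, repeat=n_predictors):
        value = sum(w * x for w, x in zip(weights, config))
        # Configurations sharing a value fall into one equivalence class.
        classes[value].append(config)
    return dict(classes)


# Compare the number of original configurations with the number of classes.
for k in range(1, 6):
    classes = group_by_linear_combination(k)
    print(f"predictors={k}  configurations={2 ** k}  classes={len(classes)}")
```

Under these assumptions, k binary predictors give 2^k original configurations but only k + 1 equivalence classes, so each class pools more samples when estimating the target gene's conditional distribution. The price is that configurations within a class, such as (0, 1) and (1, 0), can no longer be told apart.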
