Gene Expression Complex Networks: Synthesis, Identification, and Analysis

Thanks to recent advances in molecular biology, combined with an ever-increasing amount of experimental data, the functional state of thousands of genes can now be measured simultaneously using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for the validation of gene network modeling and identification, which comprises the following three main aspects: (1) generation of Artificial Gene Network (AGN) models through theoretical models of complex networks, which are used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdős-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG).
The experimental results indicate that the inference method was sensitive to variation in the average degree: its network recovery rate decreased as the average degree increased. Signal size was important for the accuracy of network identification, and the method produced very good results with small expression profiles. However, the adopted inference method was not sensitive enough to distinguish between different structures of interaction among genes, behaving similarly when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for validating the inferred networks by identifying some properties of the evaluated method, and it can be extended to other inference methods.
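
The three aspects described above (network generation, identification by feature selection, and comparison with the original network) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions made only for illustration: genes are Boolean with random truth tables, the network is a directed ER-style graph, expression data come from several random restarts, conditional entropy is the selection criterion, and an exhaustive search over small predictor subsets stands in for a more elaborate feature selection search. All function names are hypothetical.

```python
import itertools
import math
import random
from collections import Counter, defaultdict


def er_network(n, avg_degree, rng):
    """Directed ER-style graph; returns {target gene: [predictor genes]}."""
    p = avg_degree / (n - 1)  # edge probability giving the requested mean in-degree
    return {v: [u for u in range(n) if u != v and rng.random() < p]
            for v in range(n)}


def random_rules(preds, rng):
    """One random Boolean truth table per gene, indexed by predictor values."""
    return {v: {bits: rng.randint(0, 1)
                for bits in itertools.product((0, 1), repeat=len(ps))}
            for v, ps in preds.items()}


def step(state, preds, rules):
    """Synchronous update of every gene from the current state."""
    return tuple(rules[v][tuple(state[u] for u in preds[v])]
                 for v in range(len(state)))


def simulate(preds, rules, n_series, length, rng):
    """Collect (state at t, state at t+1) pairs from several random restarts."""
    pairs = []
    for _ in range(n_series):
        state = tuple(rng.randint(0, 1) for _ in range(len(preds)))
        for _ in range(length - 1):
            nxt = step(state, preds, rules)
            pairs.append((state, nxt))
            state = nxt
    return pairs


def cond_entropy(pairs, target, subset):
    """Estimate H(target at t+1 | subset at t) in bits from the observed pairs."""
    counts = defaultdict(Counter)
    for cur, nxt in pairs:
        counts[tuple(cur[u] for u in subset)][nxt[target]] += 1
    total = len(pairs)
    h = 0.0
    for dist in counts.values():
        m = sum(dist.values())
        for c in dist.values():
            h -= (c / total) * math.log2(c / m)
    return h


def infer(pairs, n, k_max=2, tol=1e-9):
    """Fix each target gene and keep the smallest subset with minimum entropy."""
    inferred = {}
    for target in range(n):
        best, best_h = [], float("inf")
        for k in range(0, k_max + 1):
            for subset in itertools.combinations(range(n), k):
                h = cond_entropy(pairs, target, subset)
                if h < best_h - tol:  # a later subset must strictly improve
                    best, best_h = list(subset), h
        inferred[target] = best
    return inferred


def compare(true_preds, inferred):
    """Edge-level precision and recall of the inferred network."""
    true_edges = {(u, v) for v, ps in true_preds.items() for u in ps}
    found = {(u, v) for v, ps in inferred.items() for u in ps}
    tp = len(true_edges & found)
    precision = tp / len(found) if found else 1.0
    recall = tp / len(true_edges) if true_edges else 1.0
    return precision, recall


if __name__ == "__main__":
    rng = random.Random(0)
    truth = er_network(8, avg_degree=1.5, rng=rng)
    rules = random_rules(truth, rng)
    data = simulate(truth, rules, n_series=20, length=10, rng=rng)
    print(compare(truth, infer(data, n=8)))
```

Varying `avg_degree`, `n_series` and `length` in such a sketch mirrors the kind of experiment summarized above, where recovery is examined as a function of average degree and signal size. Swapping `er_network` for another generator would allow other topologies to be tested in the same way.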
