Inference of gene regulatory networks from time series by Tsallis entropy

Background: Inferring gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems in systems biology. Many techniques and models have been proposed for this task. However, it is generally not possible to recover the original topology with high accuracy, mainly because the available time series are short relative to the high complexity of the networks and because expression measurements are intrinsically noisy. To improve the accuracy of GRN inference methods based on entropy (mutual information), a new criterion function is proposed here.

Results: In this paper we introduce the use of the generalized entropy proposed by Tsallis for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach, with the conditional entropy applied as the criterion function. To assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expression data generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks, and its dynamics are created by drawing each gene's transfer function at random from the set of possible Boolean functions. The DREAM time series data, in contrast, vary in network size and have topologies based on real networks; their dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes.

Conclusions: In the experiments, the non-Shannon entropy yielded a remarkable improvement in accuracy by reducing the number of false connections in the inferred topology. The best value of the free parameter of the Tsallis entropy was on average in the range 2.5 ≤ q ≤ 3.5 (hence, subextensive entropy), which opens new perspectives for GRN inference methods based on information theory and for investigating the nonextensivity of such networks. The inference algorithm and criterion function proposed here were implemented and included in the DimReduction software, which is freely available at http://sourceforge.net/projects/dimreduction and http://code.google.com/p/dimreduction/.
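
As a rough illustration of the criterion described in the abstract, the sketch below computes the Tsallis entropy S_q = (1 - Σ p_i^q) / (q - 1) and uses a mean conditional entropy of a target gene at time t+1, given candidate predictor genes at time t, to rank predictor sets. Everything in it is an assumption made for illustration: the function names, the toy Boolean series, the plain frequency weighting of the conditional entropy, and the exhaustive search over small subsets, which stands in for whatever feature selection search the DimReduction software actually uses.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def tsallis_entropy(probs, q):
    """Tsallis entropy (1 - sum p^q) / (q - 1); Shannon entropy in nats when q == 1."""
    probs = [p for p in probs if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def mean_conditional_entropy(predictor_states, target_values, q):
    """Average entropy of the target within each observed predictor state,
    weighted by how often that state occurs (an assumed, simple weighting)."""
    groups = defaultdict(Counter)
    for state, value in zip(predictor_states, target_values):
        groups[tuple(state)][value] += 1
    n = len(target_values)
    total = 0.0
    for counts in groups.values():
        m = sum(counts.values())
        probs = [c / m for c in counts.values()]
        total += (m / n) * tsallis_entropy(probs, q)
    return total


def best_predictors(series, target, q, max_size):
    """Exhaustively score predictor subsets of up to max_size genes.

    series is a list of expression states (one tuple per time point);
    predictors are read at time t and the target gene at time t + 1.
    Returns (subset, score) for the lowest mean conditional entropy;
    ties keep the first subset found, so smaller subsets are preferred.
    """
    n_genes = len(series[0])
    targets = [state[target] for state in series[1:]]
    best = (None, float("inf"))
    for size in range(1, max_size + 1):
        for subset in combinations(range(n_genes), size):
            states = [tuple(s[g] for g in subset) for s in series[:-1]]
            score = mean_conditional_entropy(states, targets, q)
            if score < best[1]:
                best = (subset, score)
    return best


# Hypothetical toy series: gene 2 at t + 1 equals (gene 0 AND gene 1) at t.
toy_series = [
    (0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0),
    (0, 0, 1), (1, 1, 0), (0, 1, 1), (1, 0, 0),
]

if __name__ == "__main__":
    print(best_predictors(toy_series, target=2, q=2.5, max_size=2))
```

The value q = 2.5 is taken from the lower end of the range reported above; setting q = 1 in the same functions recovers the Shannon-based criterion, which makes it easy to compare the two on the same data.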
