Analysis of the GRNs Inference by Using Tsallis Entropy and a Feature Selection Approach

An important problem in bioinformatics is understanding how genes are regulated and how they interact through gene networks. This knowledge supports many applications, such as the design of disease treatments and the development of drugs. It is therefore important to uncover the functional relationships among genes and to construct the gene regulatory network (GRN) from temporal expression data. However, this task usually involves data with a large number of variables and a small number of observations, which strongly motivates the use of pattern recognition and dimensionality reduction approaches. Feature selection is especially important, because it selects the predictor genes that best explain phenomena associated with the target genes. This work presents a first study of the sensitivity of entropy-based methods to the functional form of the entropy, applied to the problem of recovering GRN topology. The generalized entropy proposed by Tsallis is used to study this sensitivity. The inference process is based on a feature selection approach applied to simulated temporal expression data generated by an artificial gene network (AGN) model. The inferred GRNs are validated in terms of global network measures. The experimental results lead to some interesting conclusions, reported here for the first time.
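
To make the idea concrete, the following is a minimal sketch of entropy-based predictor selection with the Tsallis entropy S_q = (1 - sum_i p_i^q) / (q - 1), which reduces to the Shannon entropy as q approaches 1. It is not the implementation used in this work. The function names, the toy three-gene time series, the frequency-weighted form of the conditional entropy, and the exhaustive search over predictor subsets are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q = (1 - sum p^q) / (q - 1); Shannon (natural log) at q = 1."""
    probs = [p for p in probs if p > 0]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def mean_conditional_entropy(predictor_rows, target_values, q):
    """Entropy of the target within each observed predictor pattern,
    averaged with the pattern frequencies as weights (an assumed, simple form)."""
    groups = defaultdict(list)
    for pattern, y in zip(predictor_rows, target_values):
        groups[tuple(pattern)].append(y)
    n = len(target_values)
    total = 0.0
    for ys in groups.values():
        probs = [c / len(ys) for c in Counter(ys).values()]
        total += (len(ys) / n) * tsallis_entropy(probs, q)
    return total


def best_predictors(series, target_idx, k, q):
    """Exhaustively search all k-gene subsets at time t for the one that
    minimizes the conditional entropy of the target gene at time t + 1."""
    past, future = series[:-1], series[1:]
    target = [state[target_idx] for state in future]
    best = None
    for subset in combinations(range(len(series[0])), k):
        rows = [[state[g] for g in subset] for state in past]
        score = mean_conditional_entropy(rows, target, q)
        if best is None or score < best[1]:
            best = (subset, score)
    return best


# Hypothetical binary time series for genes (g0, g1, g2),
# built so that g2(t + 1) = g0(t) AND g1(t).
toy_series = [
    (0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0),
    (0, 0, 1), (1, 1, 0), (0, 1, 1), (1, 0, 0),
]

if __name__ == "__main__":
    for q in (0.5, 1.0, 2.0):
        print(q, best_predictors(toy_series, target_idx=2, k=2, q=q))
```

On this toy series the pair (g0, g1) determines g2 completely, so its conditional entropy is zero for every q tried. With noisy or undersampled data the ranking of candidate predictor sets may change with q, which is the kind of sensitivity the study examines.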
