Comparative study of GRNs inference methods based on feature selection by mutual information

Feature selection is a crucial topic in pattern recognition applications, especially in the inference of genetic regulatory networks (GRNs), which usually involves data with a large number of variables and a small number of observations. In this context, dimensionality reduction approaches such as those based on feature selection become a mandatory step for selecting the most important predictor genes, those that can explain some phenomena associated with the target genes. Given its importance in GRN inference, many feature selection methods (algorithms and criterion functions) have been proposed. However, it is essential to validate the results of these methods in order to better understand their significance. The present work proposes a comparative study of feature selection techniques involving information theory concepts, applied to the estimation of GRNs from simulated temporal expression data generated by an artificial gene network (AGN) model. Four GRN inference methods are compared in terms of global network measures. Some interesting conclusions can be drawn from the experimental results.
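To make the idea of feature selection by mutual information concrete, the following is a minimal sketch and not one of the four methods compared in this work. It assumes expression values already discretized into a few levels, a one-step time lag between predictor and target, and pairwise scoring of single predictors rather than a search over predictor subsets. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y), in bits, of two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_predictors(series, target):
    """Rank candidate predictors of `target` by I(X_t ; Y_{t+1}).

    `series` maps a gene name to its discretized temporal expression profile.
    The one-step lag is an assumption made for this illustration.
    """
    y_next = series[target][1:]
    scores = {
        gene: mutual_information(values[:-1], y_next)
        for gene, values in series.items()
        if gene != target
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical binary time series: g3 copies g1 with a one-step delay.
toy = {
    "g1": [0, 1, 1, 0, 1, 0, 0, 1],
    "g2": [1, 1, 0, 0, 1, 1, 0, 0],
    "g3": [0, 0, 1, 1, 0, 1, 0, 0],
}
ranking = rank_predictors(toy, "g3")
```

In this toy example `g1` is ranked first as a predictor of `g3`, since its past value fully determines the next value of `g3`. With few observations, empirical mutual information estimates are biased, which is one reason the choice of criterion function and search algorithm matters in the small-sample setting described above.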
