Entropic Biological Score: a cell cycle investigation for GRNs inference.

Inference of gene regulatory networks (GRNs) is one of the most challenging research problems in Systems Biology. This investigation proposes a new GRN inference methodology, called Entropic Biological Score (EBS), which linearly combines the mean conditional entropy (MCE) computed from expression levels with a Biological Score (BS) obtained by integrating different biological data sources. The EBS is validated against the Cell Cycle related functional annotation information available from the Munich Information Center for Protein Sequences (MIPS), and is compared with existing GRN inference methods: MRNET, ARACNE, CLR and MCE. For real networks, the performance of EBS, which integrates different data sources, is found to be superior to that of these methods. The best results for EBS are obtained with the weights w1=0.2 and w2=0.8 for the MCE and BS values, respectively; with these weights approximately 40% of the inferred connections are found to be correct, a significantly better result than that of the related methods. The results also indicate that expression profiles are able to recover some true connections that are not present in the biological annotations, which opens the possibility of discovering new relations between genes.
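
The linear combination described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of discretized expression levels, and the conversion of MCE into a dependence value (1 minus the normalized entropy, so that a larger score means a stronger candidate connection) are assumptions made only for illustration. Only the weights w1=0.2 and w2=0.8 come from the text.

```python
import math
from collections import Counter


def mean_conditional_entropy(x, y):
    """Return H(Y|X) in bits for two equal-length sequences of discrete levels.

    x holds the (discretized) expression levels of a candidate predictor gene,
    y those of the target gene. A lower value means x explains y better.
    """
    n = len(x)
    joint = Counter(zip(x, y))   # counts of each (x, y) pair
    marginal_x = Counter(x)      # counts of each x level
    h = 0.0
    for (xv, _yv), count in joint.items():
        p_xy = count / n
        p_y_given_x = count / marginal_x[xv]
        h -= p_xy * math.log2(p_y_given_x)
    return h


def combined_score(mce, bs, w1=0.2, w2=0.8, max_entropy=1.0):
    """Hypothetical weighted sum of an expression term and a Biological Score.

    Assumption: the entropy is turned into a dependence value in [0, 1]
    by normalizing with max_entropy (1 bit for binary levels), and bs is
    already scaled to [0, 1]. The source only states that MCE and BS are
    combined linearly with weights w1 and w2.
    """
    dependence = 1.0 - min(mce / max_entropy, 1.0)
    return w1 * dependence + w2 * bs


# Toy binary expression profiles (illustrative values, not real data).
predictor = [0, 0, 1, 1]
target = [0, 0, 1, 1]
h = mean_conditional_entropy(predictor, target)
score = combined_score(h, bs=0.5)
```

With w2=0.8, the biological annotation dominates the score, while the expression term can still promote a connection that the annotations do not support.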
