Computing Consistency Between Microarray Data and Known Gene Regulation Relationships

Microarray experiments produce expression patterns for thousands of genes at once. Meanwhile, the biomedical literature contains a large body of gene regulation relationships accumulated over the years. An automated way of comparing microarray data with this collection of known gene regulation relationships is therefore needed. Such a comparison can help biologists rapidly understand the context of a given microarray experiment, and the resulting consistency measure can be used to either validate or refute the hypothesis being tested in that experiment. In this paper we present a systematic way of examining the consistency between a given set of microarray data and known gene regulation relationships. We first introduce a simple gene regulation network model, together with two separate algorithms designed to isolate a maximally consistent network. We then extend the model to take into account multiple regulating factors for a single gene while highlighting both consistencies and inconsistencies. We illustrate the effectiveness of our approach with two practical examples: one that picks out the peroxisome proliferator-activated receptor (PPAR) pathway as highly consistent among multiple pathways of the Kyoto Encyclopedia of Genes and Genomes (KEGG), and another that isolates key regulatory relationships involving nfkb1 and others known for the macrophage's counter-response to inflammation.
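
To make the notion of consistency concrete, the following is a minimal sketch and not the algorithms of the paper. It assumes that each known relationship is a signed edge (regulator, target, sign), with +1 for activation and -1 for repression, and that the microarray data have already been reduced to a direction of change per gene (+1 up, -1 down, 0 no significant change). The gene names, the function names `classify_edges` and `consistency_score`, and the simple scoring rule are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (regulator, target, sign): sign is +1 for activation, -1 for repression.
Edge = Tuple[str, str, int]


def classify_edges(
    edges: List[Edge], direction: Dict[str, int]
) -> Tuple[List[Edge], List[Edge], List[Edge]]:
    """Split edges into consistent, inconsistent, and undetermined lists.

    An edge is consistent when the observed change of the target equals
    the regulator's change multiplied by the edge sign. Edges whose
    regulator or target shows no change (or is missing) are undetermined.
    """
    consistent: List[Edge] = []
    inconsistent: List[Edge] = []
    undetermined: List[Edge] = []
    for regulator, target, sign in edges:
        d_reg = direction.get(regulator, 0)
        d_tgt = direction.get(target, 0)
        if d_reg == 0 or d_tgt == 0:
            undetermined.append((regulator, target, sign))
        elif sign * d_reg == d_tgt:
            consistent.append((regulator, target, sign))
        else:
            inconsistent.append((regulator, target, sign))
    return consistent, inconsistent, undetermined


def consistency_score(
    edges: List[Edge], direction: Dict[str, int]
) -> Optional[float]:
    """Fraction of decidable edges that agree with the data, or None."""
    consistent, inconsistent, _ = classify_edges(edges, direction)
    decided = len(consistent) + len(inconsistent)
    return len(consistent) / decided if decided else None


# Hypothetical toy network and expression directions.
edges = [("A", "B", 1), ("A", "C", -1), ("B", "D", 1), ("C", "E", 1)]
direction = {"A": 1, "B": 1, "C": 1, "D": 0, "E": 1}

consistent, inconsistent, undetermined = classify_edges(edges, direction)
score = consistency_score(edges, direction)
```

In this toy case the edges A to B and C to E agree with the data, the repression of C by A does not (both genes are up), and the edge into D cannot be decided because D shows no change. A per-pathway score of this kind could be used to rank pathways by agreement with an experiment, although handling several regulators acting on one gene would need a richer rule than this per-edge check.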
