Dissecting complex transcriptional responses using pathway-level scores based on prior information

Background: The genomewide pattern of changes in mRNA expression measured using DNA microarrays is typically a complex superposition of the response of multiple regulatory pathways to changes in the environment of the cells. The use of prior information, either about the function of the protein encoded by each gene, or about the physical interactions between regulatory factors and the sequences controlling its expression, has emerged as a powerful approach for dissecting complex transcriptional responses.

Results: We review two different approaches for combining the noisy expression levels of multiple individual genes into robust pathway-level differential expression scores. The first is based on a comparison between the distribution of expression levels of genes within a predefined gene set and those of all other genes in the genome. The second starts from an estimate of the strength of genomewide regulatory network connectivities based on sequence information or direct measurements of protein-DNA interactions, and uses regression analysis to estimate the activity of gene regulatory pathways. The statistical methods used are explained in detail.

Conclusion: By avoiding the thresholding of individual genes, pathway-level analysis of differential expression based on prior information can be considerably more sensitive to subtle changes in gene expression than gene-level analysis. The methods are technically straightforward and yield results that are easily interpretable, both biologically and statistically.
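
As a rough illustration of the two kinds of pathway-level score described in the abstract, the following Python sketch computes (1) a two-sample t-statistic comparing genes inside a predefined set with all other genes, and (2) a least-squares slope of expression on a per-gene connectivity strength. The function names, the pooled-variance form of the t-statistic, the single-regressor model, and the toy data are all assumptions made for illustration; the abstract does not specify these details, and this is not presented as the authors' implementation.

```python
from math import sqrt
from statistics import mean, variance


def gene_set_t_score(log_ratios, gene_set):
    """Pooled-variance two-sample t-statistic: genes in the set vs. all other genes.

    log_ratios: dict mapping gene name -> log expression ratio (hypothetical input format).
    gene_set:   set of gene names defined by prior information (e.g. a functional category).
    """
    inside = [v for g, v in log_ratios.items() if g in gene_set]
    outside = [v for g, v in log_ratios.items() if g not in gene_set]
    n_in, n_out = len(inside), len(outside)
    pooled = ((n_in - 1) * variance(inside) + (n_out - 1) * variance(outside)) / (
        n_in + n_out - 2
    )
    return (mean(inside) - mean(outside)) / sqrt(pooled * (1 / n_in + 1 / n_out))


def regression_activity(log_ratios, connectivity):
    """Least-squares slope of log-ratio on connectivity strength.

    connectivity: dict mapping gene name -> strength of the regulatory connection
                  (e.g. a motif count or binding score); missing genes count as 0.
    The slope serves here as a simple stand-in for a pathway activity estimate.
    """
    genes = sorted(log_ratios)
    x = [connectivity.get(g, 0.0) for g in genes]
    y = [log_ratios[g] for g in genes]
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


# Toy data (invented for illustration only): three genes respond, three do not.
toy_log_ratios = {"g1": 1.0, "g2": 1.2, "g3": 0.8, "g4": 0.0, "g5": 0.1, "g6": -0.1}
toy_gene_set = {"g1", "g2", "g3"}
toy_connectivity = {"g1": 2.0, "g2": 2.0, "g3": 2.0}

t_score = gene_set_t_score(toy_log_ratios, toy_gene_set)
activity = regression_activity(toy_log_ratios, toy_connectivity)
print(f"gene-set t-score: {t_score:.3f}")
print(f"regression activity (slope): {activity:.3f}")
```

Neither score applies a threshold to individual genes: every gene contributes its measured value, which is the property the abstract credits for the added sensitivity of pathway-level analysis.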
