Maximization of negative correlations in time-course gene expression data for enhancing understanding of molecular pathways

Positive correlation can take diverse forms, such as shifting, scaling or geometric patterns, and it has been extensively explored in time-course gene expression data and pathway analysis. Recent biological studies show a growing focus on negative correlations, such as opposite expression patterns, complementary patterns and negative self-regulation of transcription factors (TFs). These biological ideas and preliminary observations motivate us to formulate and investigate the problem of maximizing negative correlations. The objective is to discover all maximal negative correlations of statistical and biological significance in time-course gene expression data, in order to enhance our understanding of molecular pathways. Given a gene expression matrix, a maximal negative correlation is defined as an activation–inhibition two-way expression pattern (AIE pattern). We propose a parameter-free algorithm that enumerates the complete set of AIE patterns in a data set. This algorithm can identify significant negative correlations that traditional clustering and biclustering methods cannot identify. To demonstrate the biological usefulness of AIE patterns in the analysis of molecular pathways, we conducted in-depth case studies of AIE patterns identified from yeast cell cycle data sets. In particular, in the analysis of the lysine biosynthesis pathway, new regulation modules and pathway components were inferred from a significant negative correlation that is likely caused by co-regulation by TFs at a higher layer of the biological network. We conjecture that maximal negative correlations between genes are a common characteristic of molecular pathways, and that they can provide insights for studies of cell stress response, drug response evaluation and related problems.
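As a rough illustration of what a negative correlation between two time-course expression profiles looks like, the sketch below scans a toy expression matrix for gene pairs with a strongly negative Pearson correlation. It is not the AIE enumeration algorithm described above. The gene names, the expression values, the helper names `pearson` and `negative_pairs`, and the cut-off of -0.9 are all assumptions made for this example, and the explicit cut-off is precisely the kind of parameter that the proposed parameter-free method is meant to avoid.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def negative_pairs(matrix, threshold=-0.9):
    """Return (gene1, gene2, r) for every gene pair whose correlation is <= threshold.

    `matrix` maps a gene name to its expression values over time points.
    The threshold is a hypothetical cut-off chosen only for illustration.
    """
    pairs = []
    for g1, g2 in combinations(sorted(matrix), 2):
        r = pearson(matrix[g1], matrix[g2])
        if r <= threshold:
            pairs.append((g1, g2, r))
    return pairs


# Toy data (invented): geneA and geneC rise over time, geneB falls, geneD is nearly flat.
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneC": [1.5, 2.5, 3.5, 4.5, 5.5],
    "geneD": [2.0, 2.1, 1.9, 2.0, 2.05],
}

for g1, g2, r in negative_pairs(toy):
    print(f"{g1} vs {g2}: r = {r:.2f}")
```

In this toy matrix the falling profile pairs negatively with each of the two rising profiles, while the nearly flat profile pairs with none. A pairwise scan of this kind only finds two-gene relationships over all time points; it does not capture the maximality or the two-way activation–inhibition structure that defines an AIE pattern.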
