Identifying functional relationships within sets of co-expressed genes by combining upstream regulatory motif analysis and gene expression information

Background: Existing clustering approaches for microarray data do not adequately differentiate between subsets of co-expressed genes. We devised a novel approach that integrates expression and sequence data in order to generate functionally coherent and biologically meaningful subclusters of genes. Specifically, the approach clusters co-expressed genes on the basis of similar content and distributions of predicted statistically significant sequence motifs in their upstream regions.

Results: We applied our method to several sets of co-expressed genes and were able to define subsets with enrichment in particular biological processes and specific upstream regulatory motifs.

Conclusions: These results show the potential of our technique for functional prediction and regulatory motif identification from microarray data.
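
The subclustering idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names, motif strings, counts, the cosine similarity measure, the single-linkage grouping, and the `threshold` value are all assumptions chosen for illustration. The sketch also uses only motif content (occurrence counts) and ignores the positional distribution of motifs that the abstract mentions.

```python
from math import sqrt

# Hypothetical input: predicted motif occurrence counts in the upstream
# region of each gene in one co-expressed set (illustrative values only).
motif_counts = {
    "geneA": {"ACGCGT": 2, "CGCGAAA": 0, "TGAAACA": 0},
    "geneB": {"ACGCGT": 1, "CGCGAAA": 0, "TGAAACA": 0},
    "geneC": {"ACGCGT": 0, "CGCGAAA": 2, "TGAAACA": 1},
    "geneD": {"ACGCGT": 0, "CGCGAAA": 1, "TGAAACA": 1},
}


def cosine(u, v):
    """Cosine similarity between two sparse motif-count vectors."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in keys)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def subcluster(counts, threshold=0.5):
    """Group genes whose motif profiles are similar (single linkage).

    Two genes end up in the same subcluster if they are connected by a
    chain of pairs with cosine similarity >= threshold (an assumed cutoff).
    """
    genes = sorted(counts)
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent links to the representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if cosine(counts[a], counts[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(groups.values())


subclusters = subcluster(motif_counts)
print(subclusters)
```

In this toy input, genes that share the same predicted motifs fall into the same subcluster, which is the property the approach relies on when testing each subcluster for enrichment in particular biological processes.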
