Bayesian detection of periodic mRNA time profiles without use of training examples

BackgroundDetection of periodically expressed genes from microarray data without use of known periodic and non-periodic training examples is an important problem, e.g. for identifying genes regulated by the cell-cycle in poorly characterised organisms. Commonly the investigator is only interested in genes expressed at a particular frequency that characterizes the process under study but this frequency is seldom exactly known. Previously proposed detector designs require access to labelled training examples and do not allow systematic incorporation of diffuse prior knowledge available about the period time.ResultsA learning-free Bayesian detector that does not rely on labelled training examples and allows incorporation of prior knowledge about the period time is introduced. It is shown to outperform two recently proposed alternative learning-free detectors on simulated data generated with models that are different from the one used for detector design. Results from applying the detector to mRNA expression time profiles from S. cerevisiae showsthat the genes detected as periodically expressed only contain a small fraction of the cell-cycle genes inferred from mutant phenotype. For example, when the probability of false alarm was equal to 7%, only 12% of the cell-cycle genes were detected. The genes detected as periodically expressed were found to have a statistically significant overrepresentation of known cell-cycle regulated sequence motifs. One known sequence motif and 18 putative motifs, previously not associated with periodic expression, were also over represented.ConclusionIn comparison with recently proposed alternative learning-free detectors for periodic gene expression, Bayesian inference allows systematic incorporation of diffuse a priori knowledge about, e.g. the period time. This results in relative performance improvements due to increased robustness against errors in the underlying assumptions. Results from applying the detector to mRNA expression time profiles from S. cerevisiae include several new findings that deserve further experimental studies.

[1]  R. Tjian,et al.  Orchestrated response: a symphony of transcription factors for gene control. , 2000, Genes & development.

[2]  Kerby Shedden,et al.  Analysis of cell-cycle gene expression in Saccharomyces cerevisiae using microarrays and multiple synchronization methods , 2002, Nucleic Acids Res..

[3]  G. Church,et al.  Identifying regulatory networks by combinatorial analysis of promoter elements , 2001, Nature Genetics.

[4]  M. Zweig,et al.  Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. , 1993, Clinical chemistry.

[5]  Neal S. Holter,et al.  Fundamental patterns underlying gene expression profiles: simplicity from complexity. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[6]  Peer Bork,et al.  Comparison of computational methods for the identification of cell cycle-regulated genes , 2005, Bioinform..

[7]  John A. Nelder,et al.  A Simplex Method for Function Minimization , 1965, Comput. J..

[8]  M. Tribus,et al.  Probability theory: the logic of science , 2003 .

[9]  Michael Ruogu Zhang,et al.  Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. , 1998, Molecular biology of the cell.

[10]  Korbinian Strimmer,et al.  Identifying periodically expressed transcripts in microarray time series data , 2008, Bioinform..

[11]  S. Moreno,et al.  Regulation of CDK/cyclin complexes during the cell cycle. , 1997, The international journal of biochemistry & cell biology.

[12]  Anders Berglund,et al.  A multivariate approach applied to microarray data for identification of genes with cell cycle-coupled transcription , 2003, Bioinform..

[13]  M. Ashburner,et al.  Gene Ontology: tool for the unification of biology , 2000, Nature Genetics.

[14]  L. P. Zhao,et al.  Statistical modeling of large microarray data sets to identify stimulus-response profiles , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[15]  H. Hartley,et al.  Tests of significance in harmonic analysis. , 1949, Biometrika.

[16]  Ronald W. Davis,et al.  A genome-wide transcriptional analysis of the mitotic cell cycle. , 1998, Molecular cell.

[17]  Neal S. Holter,et al.  Dynamic modeling of gene expression data. , 2001, Proceedings of the National Academy of Sciences of the United States of America.

[18]  Marvin H. J. Guber Bayesian Spectrum Analysis and Parameter Estimation , 1988 .

[19]  L. Breeden,et al.  Periodic Transcription: A Cycle within a Cycle , 2003, Current Biology.

[20]  Thomas L. Marzetta,et al.  Detection, Estimation, and Modulation Theory , 1976 .

[21]  Hongzhe Li,et al.  Model-based methods for identifying periodically expressed genes based on time course microarray gene expression data , 2004, Bioinform..

[22]  G. Church,et al.  Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. , 2000, Journal of molecular biology.

[23]  Zhaohui S. Qin,et al.  Statistical resynchronization and Bayesian detection of periodically expressed genes. , 2004, Nucleic acids research.

[24]  R. J. Cho,et al.  Candidate regulatory sequence elements for cell cycle-dependent transcription in Saccharomyces cerevisiae. , 1999, Genome research.