Predicting Gene Ontology biological process from temporal gene expression patterns

The aim of the present study was to generate hypotheses on the involvement of uncharacterized genes in biological processes. To this end, supervised learning was applied to microarray-derived time-series gene expression data. The method was objectively evaluated on known genes using cross-validation and provided high-precision Gene Ontology biological process classifications for 211 of the 213 uncharacterized genes in the data set used. In addition, new biological process roles were hypothesized for known genes. The method uses the biological knowledge expressed in Gene Ontology and generates a rule model that associates this knowledge with minimal characteristic features of temporal gene expression profiles. This model supports learning and classification of multiple biological process roles for each gene, and it can predict the participation of genes in a biological process even when the genes of that class exhibit a wide variety of expression profiles, including inverse coregulation. A considerable number of the new roles hypothesized for known genes were confirmed by a literature search. In addition, many of the biological process roles hypothesized for uncharacterized genes agreed with assumptions based on homology information. To our knowledge, a gene classifier of similar scope and functionality has not been reported previously.
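The idea of a rule model over minimal features of temporal profiles can be illustrated with a small sketch. It is not the authors' implementation: the trend labels (increasing, decreasing, constant), the change threshold, the process names, and the rules themselves are all hypothetical and chosen only for illustration. The sketch shows how one gene can receive several process labels and how two inversely coregulated genes can be assigned to the same process.

```python
from typing import List, Set, Tuple

# A feature is (start time index, end time index, trend label).
Feature = Tuple[int, int, str]


def trend(profile: List[float], start: int, end: int, threshold: float) -> str:
    """Label the change between two time points (threshold is an assumption)."""
    delta = profile[end] - profile[start]
    if delta >= threshold:
        return "increasing"
    if delta <= -threshold:
        return "decreasing"
    return "constant"


def extract_features(profile: List[float], min_len: int = 2,
                     threshold: float = 0.5) -> Set[Feature]:
    """Collect trend features over every interval spanning at least min_len steps."""
    n = len(profile)
    return {
        (s, e, trend(profile, s, e, threshold))
        for s in range(n)
        for e in range(s + min_len, n)
    }


# Hypothetical IF-THEN rules: if every condition is among the gene's
# features, the gene is assigned to the process. Two rules point to
# process_A so that inversely coregulated genes share one class.
RULES = [
    ({(0, 2, "increasing")}, "process_A"),
    ({(0, 2, "decreasing")}, "process_A"),
    ({(0, 2, "constant"), (2, 4, "increasing")}, "process_B"),
    ({(2, 4, "increasing")}, "process_C"),
]


def classify(profile: List[float]) -> Set[str]:
    """Return every process whose rule conditions are all satisfied."""
    feats = extract_features(profile)
    return {process for conditions, process in RULES if conditions <= feats}


if __name__ == "__main__":
    early_up = [0.0, 0.4, 1.0, 1.1, 1.0]
    early_down = [0.0, -0.5, -1.2, -1.0, -1.1]
    late_up = [0.0, 0.1, 0.2, 0.9, 1.5]
    for name, profile in [("early_up", early_up),
                          ("early_down", early_down),
                          ("late_up", late_up)]:
        print(name, sorted(classify(profile)))
```

In this toy setting, rules would in practice be learned from genes with known annotations and assessed by cross-validation, as the abstract describes; here they are written by hand to keep the example short.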