Classification of gene functions using support vector machine for time-course gene expression data

Since most biological systems are developmental and dynamic, time-course gene expression profiles provide an important characterization of gene functions. Assigning functions to genes of unknown function on the basis of their time-course expression is an important task in functional genomics. Recently, various methods have been proposed for classifying gene functions from time-course gene expression data. In this paper, we approach the classification of gene functions from a functional data analysis viewpoint and adopt a functional support vector machine. The functional support vector machine models the temporal effects in time-course gene expression data by incorporating both the coefficients and the basis matrix obtained from a finite expansion of the gene expressions on a set of basis functions. We apply the functional support vector machine to both real microarray data and simulated data. Our results indicate that the functional support vector machine is effective in discriminating among predefined functional classes of genes on the basis of their time-course expression profiles. The method also provides valuable functional information about interactions between genes and allows new functions to be assigned to genes with unknown functions.
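As a rough illustration of how expansion coefficients and a basis matrix can enter a kernel, the following minimal sketch fits each expression profile to a finite basis by least squares and then computes a functional linear kernel K = C W Cᵀ, where C holds the coefficients and W holds the inner products of the basis functions. Everything specific here is an assumption made for illustration and is not taken from the paper: the Fourier basis, the number of basis functions, the rescaling of time to [0, 1], the synthetic data, and all function names.

```python
import numpy as np

def fourier_basis(t, n_basis):
    """Evaluate n_basis orthonormal Fourier functions at times t in [0, 1].
    (Assumed basis; the abstract does not say which basis is used.)"""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    k = 1
    while len(cols) < n_basis:
        cols.append(np.sqrt(2.0) * np.sin(2 * np.pi * k * t))
        if len(cols) < n_basis:
            cols.append(np.sqrt(2.0) * np.cos(2 * np.pi * k * t))
        k += 1
    return np.column_stack(cols)  # shape (len(t), n_basis)

def fit_coefficients(Y, t, n_basis):
    """Least-squares basis coefficients for each row (gene) of Y."""
    Phi = fourier_basis(t, n_basis)
    C, *_ = np.linalg.lstsq(Phi, Y.T, rcond=None)
    return C.T  # shape (n_genes, n_basis)

def basis_gram(n_basis, n_grid=2001):
    """Approximate W[j, k] = integral of phi_j * phi_k over [0, 1]
    with the trapezoidal rule on a fine grid."""
    grid = np.linspace(0.0, 1.0, n_grid)
    Phi = fourier_basis(grid, n_basis)
    w = np.full(n_grid, 1.0 / (n_grid - 1))
    w[0] *= 0.5
    w[-1] *= 0.5
    return Phi.T @ (w[:, None] * Phi)

def functional_linear_kernel(C1, C2, W):
    """L2 inner products between curves, expressed through coefficients."""
    return C1 @ W @ C2.T

# Synthetic profiles (not real microarray data): two hypothetical classes.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 18, endpoint=False)  # hypothetical sampling times
class_a = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((5, t.size))
class_b = np.cos(2 * np.pi * t) + 0.1 * rng.standard_normal((5, t.size))
Y = np.vstack([class_a, class_b])

C = fit_coefficients(Y, t, n_basis=5)
W = basis_gram(5)
K = functional_linear_kernel(C, C, W)  # (10, 10) kernel matrix
```

A kernel matrix of this form could then be passed to any support vector machine solver that accepts a precomputed kernel. With an orthonormal basis W is close to the identity, so the basis matrix matters mainly for non-orthogonal bases such as B-splines.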
