Clustering of time-course gene expression data using functional data analysis

Clustering of gene expression data collected across time is receiving growing attention in the biological literature since time-course experiments allow one to understand dynamic biological processes and identify genes governed by the same processes. It is believed that genes demonstrating similar expression profiles over time might give an informative insight into how underlying biological mechanisms work. In this paper, we propose a method based on functional data analysis (FNDA) to cluster time-dependent gene expression profiles. Consideration of clustering problems using the FNDA setting provides ways to take time dependency into account by using basis function expansion to describe the partially observed curves. We also discuss how to choose the number of bases in the basis function expansion in FNDA. A synthetic cycle data and a real data are used to demonstrate the proposed method and some comparisons between the proposed and existing approaches using the adjusted Rand indices are made.

[1]  Alexander Schliep,et al.  Using hidden Markov models to analyze gene expression time course data , 2003, ISMB.

[2]  G. Church,et al.  Systematic determination of genetic network architecture , 1999, Nature Genetics.

[3]  James O. Ramsay,et al.  Applied Functional Data Analysis: Methods and Case Studies , 2002 .

[4]  B. Silverman,et al.  Functional Data Analysis , 1997 .

[5]  J. Mesirov,et al.  Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[6]  D. Botstein,et al.  Cluster analysis and display of genome-wide expression patterns. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[7]  Adrian E. Raftery,et al.  Model-based clustering and data transformations for gene expression data , 2001, Bioinform..

[8]  Martin Alexander Youngson,et al.  Linear Functional Analysis , 2000 .

[9]  Anders Berglund,et al.  A multivariate approach applied to microarray data for identification of genes with cell cycle-coupled transcription , 2003, Bioinform..

[10]  Olaf Wolkenhauer,et al.  A fully Bayesian model to cluster gene-expression profiles , 2005, ECCB/JBI.

[11]  Hongzhe Li,et al.  Clustering of time-course gene expression data using a mixed-effects model with B-splines , 2003, Bioinform..

[12]  Adrian E. Raftery,et al.  Model-Based Clustering, Discriminant Analysis, and Density Estimation , 2002 .

[13]  Shyamal D. Peddada,et al.  Gene Selection and Clustering for Time-course and Dose-response Microarray Experiments Using Order-restricted Inference , 2003, Bioinform..

[14]  Michael Ruogu Zhang,et al.  Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. , 1998, Molecular biology of the cell.

[15]  Ka Yee Yeung,et al.  Principal component analysis for clustering gene expression data , 2001, Bioinform..