A hybrid computational method for the identification of cell cycle-regulated genes

Gene expression microarrays are the most commonly available source of high-throughput biological data. They have been widely employed in recent years to define cell cycle-regulated (or periodically expressed) subsets of the genome in a number of different organisms. These studies have driven the development of various computational methods for identifying periodically expressed genes. However, agreement is remarkably poor when different computational methods are applied to the same data. Motivated by this, we propose a hybrid computational method for identifying periodically expressed genes, based on a hybrid aggregation of the estimates generated by different computational methods. The proposed hybrid method is benchmarked against three other computational methods for the identification of periodically expressed genes: a statistical test for regulation, a statistical test for periodicity, and a combined test for regulation and periodicity. Both the hybrid method and the combined test are shown to outperform the statistical test for periodicity with statistical significance. Moreover, the hybrid method is also shown to be significantly better than the combined test for regulation and periodicity.
