Principal components analysis based methodology to identify differentially expressed genes in time-course microarray data

Background: Time-course microarray experiments are increasingly used to characterize dynamic biological processes. In these experiments, the goal is to identify genes that are differentially expressed over time between different biological conditions. These genes can reveal how a biological process changes with the condition, which is essential for understanding differences in dynamics.

Results: In this paper, we propose a novel method for finding genes that are differentially expressed in time-course data across two biological conditions (say C1 and C2). We model the expression at C1 using Principal Component Analysis (PCA) and represent the expression profile of each gene as a linear combination of the dominant Principal Components (PCs). The expression data from C2 are then projected onto the PCA model built from C1, and scores are extracted. The difference between the C1 and C2 scores is evaluated using a hypothesis test to quantify the significance of differential expression. We evaluate the proposed method in two case studies: (1) the heat shock response of wild-type and HSF1 knockout mice, and (2) the cell cycle in wild-type and Fkh1/Fkh2 knockout yeast strains.

Conclusion: In both cases, the proposed method identified biologically significant genes.
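
The projection step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_pca`, `project`, `score_difference_statistic`), the genes-by-time-points matrix layout, the default of two retained PCs, and in particular the test statistic are all hypothetical. The abstract does not say which hypothesis test is used, so the sketch substitutes a simple robustly standardized squared distance between the C1 and C2 scores of each gene.

```python
import numpy as np

def fit_pca(X1, n_components):
    """Fit PCA on condition C1. X1 has shape (genes, time points)."""
    mean = X1.mean(axis=0)
    # Rows of Vt are the principal components (temporal patterns).
    _, _, Vt = np.linalg.svd(X1 - mean, full_matrices=False)
    return mean, Vt[:n_components]

def project(X, mean, components):
    """Scores of each gene profile on the C1 principal components."""
    return (X - mean) @ components.T

def score_difference_statistic(X1, X2, n_components=2):
    """Per-gene statistic for the C2-vs-C1 score difference.

    Assumption: each component of the difference is scaled by a robust
    (MAD-based) estimate of its spread across genes, and the squared
    scaled differences are summed. This stands in for the unspecified
    hypothesis test of the paper.
    """
    mean, comps = fit_pca(X1, n_components)
    diff = project(X2, mean, comps) - project(X1, mean, comps)
    centered = diff - np.median(diff, axis=0)
    mad = 1.4826 * np.median(np.abs(centered), axis=0)
    return np.sum((centered / mad) ** 2, axis=1)
```

Genes with large values of the statistic are candidates for differential expression. To turn the statistic into a significance level, it would have to be compared against a reference distribution (for example a permutation null), which is left out here because the source does not specify one.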
