Robust detection of periodic patterns in gene expression microarray data using topological signal analysis

In this paper, we present a new approach for analyzing gene expression data that builds on the topological characteristics of time series. Our goal is to identify cell-cycle-regulated genes in microarray datasets. We construct a point cloud from each time series using delay coordinate embeddings. Persistent homology is then used to analyze the topology of the point cloud and detect periodicity. This novel technique is accurate and robust to noise, missing data points, and varying sampling intervals. Our experiments on a yeast (Saccharomyces cerevisiae) dataset substantiate the capabilities of the proposed method.
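The delay coordinate embedding step can be illustrated with a minimal sketch. The function name `delay_embedding`, the embedding dimension `dim`, the delay `tau`, and the synthetic sine signal are all assumptions made for illustration; the abstract does not state the parameters or data handling actually used. The sketch only shows why a periodic signal yields a loop-shaped point cloud, which is the feature a persistent homology computation would then look for.

```python
import math


def delay_embedding(series, dim=2, tau=1):
    """Map a scalar series to points (x[t], x[t+tau], ..., x[t+(dim-1)*tau]).

    `dim` and `tau` are illustrative parameters, not values from the paper.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_points = len(series) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("series is too short for this dim and tau")
    return [
        tuple(series[t + k * tau] for k in range(dim))
        for t in range(n_points)
    ]


# Synthetic stand-in for a periodic expression profile (an assumption):
# a sine wave with a period of 16 samples, observed for 48 samples.
period = 16
signal = [math.sin(2 * math.pi * t / period) for t in range(48)]

# A delay of a quarter period pairs sin(theta) with cos(theta),
# so every embedded point lies on the unit circle.
cloud = delay_embedding(signal, dim=2, tau=period // 4)

radii = [math.hypot(x, y) for x, y in cloud]
print(len(cloud), min(radii), max(radii))
```

For this noiseless signal the embedded points trace a closed loop. A non-periodic series would generally not produce such a loop, and a one-dimensional persistent homology class with a long lifetime is the kind of signature that distinguishes the two cases.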
