Biclustering of ARMA time series

Biclustering groups objects and attributes simultaneously in order to uncover multiple hidden patterns. In a long time series, clusters that span the whole sequence are unlikely to be meaningful; biclustering can instead find more significant clusters that cover only part of the sequence. This paper proposes a new biclustering algorithm for time series data that follow an autoregressive moving average (ARMA) model. We assume the plaid model but modify the algorithm to incorporate the sequential nature of time series data. Maximum likelihood estimation (MLE) is used to estimate the ARMA coefficients in each bicluster. We apply the proposed method to several synthetic data sets generated from ARMA models of different orders. The experimental results show that the proposed method compares favorably with other biclustering methods for time series data.
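
To make the setting concrete, the following is a minimal sketch of how synthetic data of this kind might be built: a matrix of series in which a subset of rows follows one ARMA model over a contiguous time window, with white noise elsewhere. It is an illustration under stated assumptions, not the generator used in the experiments; the function names `simulate_arma` and `make_synthetic_matrix`, the white-noise background, and all parameter values are hypothetical.

```python
import random


def simulate_arma(n, phi, theta, sigma=1.0, burn_in=100, rng=None):
    """Simulate n points of a zero-mean ARMA(p, q) process:
    x_t = sum_i phi[i] * x_{t-1-i} + e_t + sum_j theta[j] * e_{t-1-j}.
    """
    rng = rng or random.Random()
    p, q = len(phi), len(theta)
    x, e = [], []
    for t in range(n + burn_in):
        e_t = rng.gauss(0.0, sigma)
        # Autoregressive and moving-average parts use only available history.
        ar = sum(phi[i] * x[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * e[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        x.append(ar + ma + e_t)
        e.append(e_t)
    # Drop the burn-in so the returned values are close to stationary.
    return x[burn_in:]


def make_synthetic_matrix(n_rows, n_cols, bic_rows, start, length,
                          phi, theta, seed=0):
    """Rows are series, columns are time points.

    Assumption: the background is standard white noise, and the rows in
    bic_rows follow the same ARMA model on columns start..start+length-1.
    """
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_cols)]
            for _ in range(n_rows)]
    for r in bic_rows:
        # Same coefficients for every row, independent noise per row.
        data[r][start:start + length] = simulate_arma(
            length, phi, theta, rng=rng)
    return data


# Hypothetical example: 6 series of length 50, with an ARMA(1, 1)
# bicluster on rows 1 and 3 over time points 10..29.
data = make_synthetic_matrix(6, 50, [1, 3], 10, 20, [0.6], [0.3], seed=1)
```

A biclustering method for this setting would then be expected to recover both the row subset and the time window, and to estimate the ARMA coefficients within it.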
