CinC Challenge: Cluster analysis of multi-granular time-series data for mortality rate prediction

The goal of this research is to develop novel cluster analysis techniques that identify similarity between ICU time series; the clustering results are then used for ICU mortality prediction. To preprocess multi-granular ICU time series, we propose a segmentation-based method that divides each time series into several segments. The minimum and maximum values within each segment are retained to preserve the statistical features of that segment. A weighted Euclidean distance function is used to evaluate the similarity between two instances, and clustering then maps each time series to a corresponding cluster number. In this way, the high-dimensional ICU time-series data are reduced to a 2-dimensional matrix. A rule-based classification model is developed from this matrix and used to predict in-hospital mortality for test cases. Experiments show that this approach is effective in handling ICU time-series data.
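
The preprocessing and distance steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the equal-width segmentation, the skipping of empty segments, the nearest-center assignment, and all names (`segment_min_max`, `weighted_euclidean`, `assign_cluster`, `horizon`) are assumptions made for illustration, since the abstract does not specify them.

```python
import math

def segment_min_max(times, values, n_segments, horizon):
    """Split [0, horizon) into equal-width segments (an assumption) and
    return one (min, max) pair per segment; empty segments give (None, None)."""
    width = horizon / n_segments
    buckets = [[] for _ in range(n_segments)]
    for t, v in zip(times, values):
        # Clamp so a measurement exactly at the horizon lands in the last segment.
        idx = min(int(t // width), n_segments - 1)
        buckets[idx].append(v)
    return [(min(b), max(b)) if b else (None, None) for b in buckets]

def weighted_euclidean(a, b, weights):
    """Weighted Euclidean distance over per-segment (min, max) features.
    Segments missing in either series are skipped (an assumption)."""
    total = 0.0
    for (a_min, a_max), (b_min, b_max), w in zip(a, b, weights):
        if a_min is None or b_min is None:
            continue
        total += w * ((a_min - b_min) ** 2 + (a_max - b_max) ** 2)
    return math.sqrt(total)

def assign_cluster(features, centers, weights):
    """Return the index of the nearest cluster center (hypothetical rule)."""
    dists = [weighted_euclidean(features, c, weights) for c in centers]
    return dists.index(min(dists))

# Irregularly sampled heart-rate readings over a 48-hour window (made-up values).
hr = segment_min_max([0, 10, 25, 47], [80, 90, 70, 100], n_segments=2, horizon=48)
centers = [[(80, 90), (70, 96)], [(120, 140), (110, 150)]]
cluster_id = assign_cluster(hr, centers, weights=[1.0, 1.0])
```

Applying `assign_cluster` to each variable of each patient would yield one cluster number per (patient, variable) pair, which is one way to read the 2-dimensional matrix mentioned in the abstract.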
