Multichannel biomedical time series clustering via hierarchical probabilistic latent semantic analysis

Biomedical time series clustering, which automatically groups a collection of time series according to their internal similarity, is important for medical record management and inspection tasks such as bio-signal archiving and retrieval. In this paper, we propose a novel framework that automatically groups a set of unlabelled multichannel biomedical time series according to their internal structural similarity. Specifically, we treat a multichannel biomedical time series as a document and extract local segments from the time series as words. To cluster a set of unlabelled multichannel time series, we extend a topic model originally developed for visual motion analysis, the Hierarchical probabilistic Latent Semantic Analysis (H-pLSA). In the first layer, the H-pLSA models each channel of the multichannel time series with a local pLSA. The topics learned by the local pLSAs are then fed to a global pLSA in the second layer, which discovers the categories of the multichannel time series. Experiments on a dataset extracted from multichannel electrocardiography (ECG) signals demonstrate that the proposed method outperforms previous state-of-the-art approaches and is relatively robust to variations in its parameters, including the length of the local segments and the dictionary size. Although the experimental evaluation used multichannel ECG signals in a biometric scenario, the proposed algorithm is a general framework for clustering multichannel biomedical time series according to their structural similarity, and it has many applications in biomedical time series management.
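The two-layer structure described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names (plsa, hierarchical_plsa), the scale parameter, and the way local topic proportions are turned into pseudo-counts for the global layer are all assumptions made for illustration. The sketch also assumes that each channel has already been converted into a document-by-word count matrix, i.e. that local segments were extracted and quantised against a dictionary beforehand.

```python
import random


def _normalise(row):
    """Scale a non-negative row to sum to 1 (uniform if the row is all zeros)."""
    total = sum(row)
    if total > 0:
        return [v / total for v in row]
    return [1.0 / len(row)] * len(row)


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit a basic pLSA model with EM.

    counts: list of rows, counts[d][w] = occurrences of word w in document d.
    Returns (p_z_d, p_w_z): topic mixture per document, word distribution per topic.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    p_z_d = [_normalise([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [_normalise([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                post = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(post)
                if total == 0:
                    continue
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    r = n * post[z] / total
                    new_z_d[d][z] += r
                    new_w_z[z][w] += r
        # M-step: renormalise the expected counts.
        p_z_d = [_normalise(row) for row in new_z_d]
        p_w_z = [_normalise(row) for row in new_w_z]
    return p_z_d, p_w_z


def hierarchical_plsa(channel_counts, n_local_topics, n_categories, scale=100.0):
    """Two-layer sketch: one local pLSA per channel, then one global pLSA.

    channel_counts[c] is the document-by-word count matrix of channel c.
    Assumption: local topic proportions are scaled into pseudo-counts and
    concatenated across channels to form the "words" of the global layer.
    """
    n_docs = len(channel_counts[0])
    global_counts = [[] for _ in range(n_docs)]
    for c, counts in enumerate(channel_counts):
        p_z_d, _ = plsa(counts, n_local_topics, seed=c)
        for d in range(n_docs):
            global_counts[d].extend(scale * p for p in p_z_d[d])
    p_cat_d, _ = plsa(global_counts, n_categories, seed=len(channel_counts))
    # Assign each multichannel series to its most probable category.
    return [max(range(n_categories), key=row.__getitem__) for row in p_cat_d]
```

Calling hierarchical_plsa with one count matrix per channel returns one category index per multichannel time series. Because EM starts from a random initialisation, the category indices are arbitrary labels, and different seeds may permute them or converge to different local optima.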
