Bag-of-words representation for biomedical time series classification

Abstract: Automatic analysis of biomedical time series such as electroencephalographic (EEG) and electrocardiographic (ECG) signals has attracted great interest in the biomedical engineering community because of its important applications in medicine. In this work, a simple yet effective bag-of-words representation, originally developed for text document analysis, is extended to biomedical time series. As in the bag-of-words model used in the text document domain, the proposed method treats a time series as a text document and extracts local segments from the time series as words. The time series is then represented as a histogram of codewords, where each entry is the number of times a codeword appears in the time series. Although the temporal order of the local segments is ignored, the bag-of-words representation is able to capture high-level structural information because both local and global structural information is well utilized. The performance of the bag-of-words model is validated on three datasets extracted from real EEG and ECG signals. The experimental results demonstrate that the proposed method is not only insensitive to the parameters of the bag-of-words model, such as local segment length and codebook size, but also robust to noise.
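The pipeline described in the abstract (local segments as words, a codebook, a histogram of codeword counts) can be illustrated with a minimal sketch. The abstract does not say how the codebook is built or how segments are matched to codewords, so the plain k-means clustering, the squared Euclidean distance, and all function and parameter names below (`extract_segments`, `build_codebook`, `assign_codewords`, `bow_histogram`, `seg_len`, `step`) are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def extract_segments(series, seg_len, step=1):
    """Slide a window of length seg_len over a 1-D series; one row per local segment."""
    series = np.asarray(series, dtype=float)
    starts = range(0, len(series) - seg_len + 1, step)
    return np.array([series[s:s + seg_len] for s in starts])

def assign_codewords(segments, codebook):
    """Index of the nearest codeword (squared Euclidean distance, an assumption)."""
    dists = ((segments[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def build_codebook(segments, n_codewords, n_iter=20, seed=0):
    """Cluster segments with plain k-means (an assumption); centroids act as codewords."""
    rng = np.random.default_rng(seed)
    init = rng.choice(len(segments), size=n_codewords, replace=False)
    codebook = segments[init].copy()
    for _ in range(n_iter):
        labels = assign_codewords(segments, codebook)
        for k in range(n_codewords):
            members = segments[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def bow_histogram(series, codebook, seg_len, step=1):
    """Count how often each codeword occurs in the series; temporal order is discarded."""
    segments = extract_segments(series, seg_len, step)
    labels = assign_codewords(segments, codebook)
    return np.bincount(labels, minlength=len(codebook))

# Usage on a synthetic signal (a stand-in for an EEG or ECG recording).
rng = np.random.default_rng(1)
signal = np.sin(np.linspace(0, 20, 200)) + 0.1 * rng.standard_normal(200)
codebook = build_codebook(extract_segments(signal, seg_len=16, step=4), n_codewords=5)
histogram = bow_histogram(signal, codebook, seg_len=16, step=4)
print(histogram)
```

In this sketch `seg_len` and `n_codewords` correspond to the local segment length and codebook size that the abstract names as model parameters. In a classification setting the codebook would presumably be learned from training series only, and the resulting histograms passed to a classifier of choice.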
