DTW-D: time series semi-supervised learning from a single example

Classification of time series data is an important problem with applications in virtually every scientific endeavor. The large research community working on time series classification has typically used the UCR Archive to test its algorithms. In this work we argue that the availability of this resource has isolated much of the research community from the following reality: labeled time series data is often very difficult to obtain. The obvious solution to this problem is semi-supervised learning; however, as we shall show, direct applications of off-the-shelf semi-supervised learning algorithms do not typically work well for time series. We explain why semi-supervised learning algorithms typically fail for time series problems, and we introduce a simple but very effective fix. We demonstrate our ideas on diverse real-world problems.
