Autowarp: Learning a Warping Distance from Unlabeled Time Series Using Sequence Autoencoders

Measuring similarities between unlabeled time series trajectories is an important problem in many domains such as medicine, economics, and vision. It is often unclear which metric is appropriate because of the complex nature of noise in the trajectories (e.g., different sampling rates or outliers). Experts typically hand-craft or manually select a specific metric, such as Dynamic Time Warping (DTW), to apply to their data. In this paper, we propose an end-to-end framework, autowarp, that optimizes and learns a good metric given unlabeled trajectories. We define a flexible and differentiable family of warping metrics, which encompasses common metrics such as DTW, Edit Distance, and Euclidean distance. Autowarp then leverages the representation power of sequence autoencoders to optimize for a member of this warping family. The output is a metric that is easy to interpret and can be robustly learned from relatively few trajectories. In systematic experiments across different domains, we show that autowarp often outperforms hand-crafted trajectory similarity metrics.
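For readers unfamiliar with warping distances, the following is a minimal sketch of classic DTW, one of the hand-selected metrics the abstract mentions. It is not the autowarp method or its parametrized warping family; the function name `dtw_distance`, the absolute-difference local cost, and the restriction to 1-D sequences are assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    Illustrative only: uses |x - y| as the local cost (an assumption)
    and the standard three-way recurrence over a cumulative cost table.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(
                cost[i - 1][j],
                cost[i][j - 1],
                cost[i - 1][j - 1],
            )
    return cost[n][m]


if __name__ == "__main__":
    # The second sequence repeats its first sample, as if sampled at a
    # different rate; warping absorbs the repetition.
    print(dtw_distance([0, 1, 2, 3], [0, 0, 1, 2, 3]))
    # A change in value, rather than in timing, is still penalized.
    print(dtw_distance([0, 1, 2, 3], [0, 1, 2, 4]))
```

Because the alignment may repeat samples from either sequence, a difference in sampling rate alone need not increase the distance, whereas a lock-step metric such as Euclidean distance requires equal lengths and compares samples index by index.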
