Classification of streaming time series under more realistic assumptions

Much of the vast literature on time series classification makes several assumptions about the data and the algorithm’s eventual deployment that are almost certainly unwarranted. For example, many research efforts assume that the beginning and ending points of the pattern of interest can be correctly identified, both during the training phase and in later deployment. Another example is the common assumption that queries will be made at a constant rate that is known ahead of time, so that computational resources can be exactly budgeted. In this work, we argue that these assumptions are unjustified, and that they have in many cases led to unwarranted optimism about the performance of the proposed algorithms. As we shall show, the task of correctly extracting individual gait cycles, heartbeats, gestures, behaviors, etc., is generally much more difficult than the task of actually classifying those patterns. Likewise, gesture classification systems deployed on a device such as Google Glass may issue queries at frequencies that range over an order of magnitude, making it difficult to plan computational resources. We propose to mitigate these problems by introducing an alignment-free time series classification framework. The framework requires only very weakly annotated data, such as “in this ten minutes of data, we see mostly normal heartbeats…,” and, by generalizing the classic machine learning idea of data editing to streaming/continuous data, allows us to build robust, fast and accurate anytime classifiers. We demonstrate on several diverse real-world problems that, beyond removing unwarranted assumptions and requiring essentially no human intervention, our framework is both extremely fast and significantly more accurate than current state-of-the-art approaches.
