Benchmarking time series classification - Functional data vs machine learning approaches

Time series classification problems have drawn increasing attention in the machine learning and statistics communities. Closely related is the field of functional data analysis (FDA), which covers problems that deal with the analysis of data continuously indexed over some domain. While often employing different methods, both fields strive to answer similar questions, a common example being classification or regression problems with functional covariates. We study methods from functional data analysis, such as functional generalized additive models, as well as functionality to combine (functional) feature extraction or basis representations with traditional machine learning algorithms like support vector machines or classification trees. To assess the methods and implementations, we run a benchmark on a wide variety of representative (time series) data sets, analyze the empirical results in depth, and aim to give non-expert practitioners a reference ranking of which method(s) to use. Additionally, we provide a software framework in R for supervised learning with functional data, including machine learning methods and linear approaches from statistics. This allows convenient access to these methods, and in connection with the machine learning toolbox mlr, they can now also be tuned and benchmarked.
