Paper information - MultiETSC: automated machine learning for early time series classification

MultiETSC: automated machine learning for early time series classification

Early time series classification (EarlyTSC) involves the prediction of a class label based on partial observation of a given time series. Most EarlyTSC algorithms treat accuracy and earliness as two competing objectives and control the trade-off between them with a single dedicated hyperparameter. Obtaining insight into this trade-off requires finding a set of non-dominated (Pareto efficient) classifiers. So far, this has been approached through manual hyperparameter tuning. Since the trade-off hyperparameters only provide indirect control over the earliness-accuracy trade-off, manual tuning is tedious and tends to result in many sub-optimal hyperparameter settings. This complicates the search for optimal hyperparameter settings and forms a hurdle for the application of EarlyTSC to real-world problems. To address these issues, we propose an automated approach to hyperparameter tuning and algorithm selection for EarlyTSC, building on developments in the fast-moving research area known as automated machine learning (AutoML). To deal with the challenging task of optimising two conflicting objectives in early time series classification, we propose MultiETSC, a system for multi-objective algorithm selection and hyperparameter optimisation (MO-CASH) for EarlyTSC. MultiETSC can potentially leverage any existing or future EarlyTSC algorithm and produces a set of Pareto optimal algorithm configurations from which a user can choose a posteriori. As an additional benefit, our proposed framework can incorporate time series classification algorithms not originally designed for EarlyTSC and leverage them to improve performance on EarlyTSC; we demonstrate this property using a newly defined, “naïve” fixed-time algorithm. In an extensive empirical evaluation of our new approach on a benchmark of 115 data sets, we show that MultiETSC performs substantially better than baseline methods, ranking highest (avg. rank 1.98) compared to conceptually simpler single-algorithm (2.98) and single-objective alternatives (4.36).
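
The notion of a non-dominated (Pareto efficient) set of classifiers can be illustrated with a minimal Python sketch. It is not the MultiETSC implementation: the names `dominates` and `pareto_front`, the representation of each configuration as an (earliness, error rate) pair with both values minimised, and the example numbers are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (earliness, error rate), both to be minimised.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates, sorted by earliness."""
    return sorted({p for p in points if not any(dominates(q, p) for q in points)})


# Made-up evaluation results for five hypothetical configurations.
configs = [(0.2, 0.30), (0.5, 0.10), (0.5, 0.25), (0.9, 0.10), (1.0, 0.05)]
front = pareto_front(configs)
print(front)
```

In this toy example, (0.5, 0.25) and (0.9, 0.10) are discarded because (0.5, 0.10) is at least as early and at least as accurate as each of them; the three remaining configurations are the ones a user would choose among a posteriori.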

Holger H. Hoos | Mitra Baratchi | Gilles Ottervanger
