Prediction of hierarchical time series using structured regularization and its application to artificial neural networks

This paper discusses the prediction of hierarchical time series, where each upper-level time series is calculated by summing appropriate lower-level time series. Forecasts for such hierarchical time series should be coherent, meaning that the forecast for an upper-level time series equals the sum of forecasts for corresponding lower-level time series. Previous methods for making coherent forecasts consist of two phases: first computing base (incoherent) forecasts and then reconciling those forecasts based on their inherent hierarchical structure. To improve time series predictions, we propose a structured regularization method for completing both phases simultaneously. The proposed method is based on a prediction model for bottom-level time series and uses a structured regularization term to incorporate upper-level forecasts into the prediction model. We also develop a backpropagation algorithm specialized for applying our method to artificial neural networks for time series prediction. Experimental results using synthetic and real-world datasets demonstrate that our method is comparable in terms of prediction accuracy and computational efficiency to other methods for time series prediction.
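The coherence requirement and the role of a structured regularization term can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the two-level hierarchy (one total equal to the sum of two bottom-level series), the squared-error penalty, and the names `AGG`, `aggregate`, `is_coherent`, and `regularized_loss`. The sketch shows one plausible loss: a bottom-level fit term plus a penalty on the gap between aggregated bottom-level predictions and upper-level values.

```python
import numpy as np

# Hypothetical hierarchy (illustrative only): total = series_A + series_B.
# Each row of AGG maps the bottom-level series to one upper-level series.
AGG = np.array([[1.0, 1.0]])


def aggregate(bottom):
    """Sum bottom-level values (time x n_bottom) into upper-level values (time x n_upper)."""
    return bottom @ AGG.T


def is_coherent(upper_forecast, bottom_forecast, tol=1e-8):
    """True if each upper-level forecast equals the sum of its bottom-level forecasts."""
    return np.allclose(upper_forecast, aggregate(bottom_forecast), atol=tol)


def regularized_loss(bottom_pred, bottom_true, upper_true, lam):
    """Bottom-level squared error plus an assumed structured penalty.

    The penalty measures how far the aggregated bottom-level predictions
    are from the upper-level values; lam controls its weight.
    """
    fit = np.sum((bottom_pred - bottom_true) ** 2)
    penalty = np.sum((aggregate(bottom_pred) - upper_true) ** 2)
    return fit + lam * penalty


bottom_true = np.array([[1.0, 2.0], [3.0, 4.0]])
upper_true = aggregate(bottom_true)

# Errors that cancel within the hierarchy leave the penalty at zero.
pred_cancel = bottom_true + np.array([[0.5, -0.5], [0.0, 0.0]])
# An error in a single series also shows up at the upper level.
pred_shift = bottom_true + np.array([[1.0, 0.0], [0.0, 0.0]])

loss_cancel = regularized_loss(pred_cancel, bottom_true, upper_true, lam=2.0)
loss_shift = regularized_loss(pred_shift, bottom_true, upper_true, lam=2.0)
```

In this sketch, forecasts built by aggregating bottom-level predictions are coherent by construction, so the penalty only influences how the bottom-level model is fitted. Setting `lam=0` reduces the loss to an ordinary bottom-level fit.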
