Minimum Message Length in Hybrid ARMA and LSTM Model Forecasting

Modeling and analysis of time series are important in applications including economics, engineering, environmental science and social science. Selecting the best time series model, with accurately estimated parameters, for forecasting is a challenging objective for scientists and academic researchers. Hybrid models combining neural networks and traditional Autoregressive Moving Average (ARMA) models are being used to improve the accuracy of modeling and forecasting time series. Most existing time series models are selected by information-theoretic approaches, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the Hannan-Quinn criterion (HQ). This paper revisits a model selection technique based on Minimum Message Length (MML) and investigates its use in hybrid time series analysis. MML is a Bayesian information-theoretic approach and has been used in selecting the best ARMA model. We utilize the long short-term memory (LSTM) approach to construct a hybrid ARMA-LSTM model and show that MML performs better than AIC, BIC, and HQ in selecting the model, both for traditional ARMA models (without LSTM) and for hybrid ARMA-LSTM models. These results held on simulated data and on both real-world datasets that we considered. We also develop a simple MML ARIMA model.
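For readers unfamiliar with the baseline criteria named above, the following is a minimal sketch of how AIC, BIC, and HQ are typically computed from a fitted model's maximized log-likelihood and then used to pick an ARMA order. It is not the paper's code and does not implement MML. The function names, the parameter count k = p + q + 1 (AR terms, MA terms, and noise variance), and the log-likelihood values in the usage note are illustrative assumptions.

```python
import math


def information_criteria(log_likelihood, k, n):
    """Return AIC, BIC and HQ for a model with k estimated parameters
    fitted to n observations (lower is better for all three)."""
    aic = -2.0 * log_likelihood + 2.0 * k
    bic = -2.0 * log_likelihood + k * math.log(n)
    hq = -2.0 * log_likelihood + 2.0 * k * math.log(math.log(n))
    return {"AIC": aic, "BIC": bic, "HQ": hq}


def select_order(log_likelihoods, n):
    """Pick the ARMA(p, q) order minimizing each criterion.

    log_likelihoods maps (p, q) -> maximized log-likelihood.
    Assumption: k = p + q + 1 parameters per candidate model.
    """
    scores = {
        order: information_criteria(ll, order[0] + order[1] + 1, n)
        for order, ll in log_likelihoods.items()
    }
    return {
        name: min(scores, key=lambda order: scores[order][name])
        for name in ("AIC", "BIC", "HQ")
    }
```

With hypothetical log-likelihoods such as `{(1, 0): -105.0, (1, 1): -100.0, (2, 2): -99.5}` and `n = 50`, `select_order` returns the order favored by each criterion. The three criteria differ only in how strongly they penalize the parameter count, which is why they can disagree on small samples.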
