Comparative Analysis of the Hidden Markov Model and LSTM: A Simulative Approach

Time series and sequential data have gained significant attention recently because many real-world processes in domains such as finance, education, biology, and engineering can be modeled as time series. Although many algorithms and methods, such as the Kalman filter, the hidden Markov model (HMM), and long short-term memory (LSTM), have been proposed to make inferences and predictions from such data, their suitability depends heavily on the application, the type of problem, the available data, and the required accuracy or acceptable loss. In this paper, we compare supervised and unsupervised hidden Markov models with LSTM in terms of the amount of training data needed, complexity, and forecasting accuracy. Moreover, we propose various techniques to discretize the observations and convert the problem to a discrete hidden Markov model under stationary and non-stationary conditions. Our results indicate that even an unsupervised hidden Markov model can outperform LSTM when a massive amount of labeled data is not available. Furthermore, we show that the hidden Markov model can still be an effective method for processing sequence data even when the first-order Markov assumption is not satisfied.
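To make the discretization step concrete, the following is a minimal sketch, not the paper's own procedure, which the abstract does not specify. It assumes quantile binning with fixed cut points (a choice suited to the stationary case only) and a supervised discrete HMM whose transition and emission matrices are estimated by smoothed counting from labeled hidden states. The function names, the toy data, and the `smoothing` parameter are all hypothetical.

```python
from bisect import bisect_right


def quantile_edges(values, n_bins):
    """Return n_bins - 1 cut points giving roughly equal-count bins (assumed scheme)."""
    ordered = sorted(values)
    return [ordered[(len(ordered) * k) // n_bins] for k in range(1, n_bins)]


def discretize(values, edges):
    """Map each continuous observation to a symbol index in 0..len(edges)."""
    return [bisect_right(edges, v) for v in values]


def fit_supervised_hmm(states, symbols, n_states, n_symbols, smoothing=1.0):
    """Estimate transition and emission matrices by additive-smoothed counting."""
    trans = [[smoothing] * n_states for _ in range(n_states)]
    emit = [[smoothing] * n_symbols for _ in range(n_states)]
    # Count state-to-state transitions between consecutive time steps.
    for prev, nxt in zip(states, states[1:]):
        trans[prev][nxt] += 1
    # Count which symbol was emitted in which state.
    for s, o in zip(states, symbols):
        emit[s][o] += 1

    def normalize(row):
        total = sum(row)
        return [c / total for c in row]

    return [normalize(r) for r in trans], [normalize(r) for r in emit]


# Hypothetical toy data: continuous observations with known hidden-state labels.
observations = [0.1, 0.4, 0.35, 2.1, 2.5, 1.9, 0.2, 0.3, 2.2, 2.4]
hidden_states = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]

edges = quantile_edges(observations, n_bins=3)
symbols = discretize(observations, edges)
A, B = fit_supervised_hmm(hidden_states, symbols, n_states=2, n_symbols=3)
```

Here `A` holds the estimated state-transition probabilities and `B` the emission probabilities over the three symbols. Under non-stationary conditions, fixed cut points would likely drift out of date; one possible adaptation, again an assumption rather than the paper's method, is to recompute `edges` over a sliding window.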
