Learning Complex Uncertain States Changes via Asymmetric Hidden Markov Models: an Industrial Case

In many problems involving multivariate time series, Hidden Markov Models (HMMs) are often employed to model complex behavior over time. HMMs can, however, require a large number of states, which can lead to overfitting, especially when limited data is available. In this work, we propose a family of models called Asymmetric Hidden Markov Models (HMM-As), which generalize the emission distributions to arbitrary Bayesian-network distributions. The new model allows for state-specific graphical structures defined over the space of observable features, which yields more compact state spaces and hence better handling of the complexity-overfitting trade-off. We first define asymmetric HMMs and then a learning procedure, inspired by the structural expectation-maximization framework, that allows learning to be decomposed per state. Then, we relate representation aspects of HMM-As to standard and independent HMMs. The last contribution of the paper is a set of experiments that elucidate the behavior of asymmetric HMMs in practical scenarios, including simulations and industry-based scenarios. The empirical results indicate that HMMs are limited when learning structured distributions, a limitation that the more parsimonious representation of HMM-As avoids. Furthermore, HMM-As proved promising in uncovering multiple graphical structures and providing better model fit in a case study from the domain of large-scale printers, thus providing additional problem insight.
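
To make the idea of state-specific emission structures concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical two-state model over two binary features: in state 0 the features are independent (empty graph), while in state 1 the emission network has an edge X1 -> X2. All names and probability values (`initial`, `transition`, `p_x1`, `emission`, `likelihood`) are illustrative assumptions; the sequence likelihood is computed with the standard forward recursion.

```python
import numpy as np

# Hypothetical parameters, chosen only for illustration.
initial = np.array([0.6, 0.4])            # P(S_1 = s)
transition = np.array([[0.9, 0.1],        # P(S_t+1 = j | S_t = i)
                       [0.2, 0.8]])

p_x1 = np.array([0.3, 0.7])               # P(X1 = 1 | state)
p_x2_state0 = 0.5                         # state 0: X2 independent of X1
p_x2_state1 = np.array([0.1, 0.9])        # state 1: P(X2 = 1 | X1 = x1)


def bernoulli(p, x):
    """Probability of binary outcome x under success probability p."""
    return p if x == 1 else 1.0 - p


def emission(state, obs):
    """P(X1, X2 | state), factorized by the graph attached to that state."""
    x1, x2 = obs
    prob_x1 = bernoulli(p_x1[state], x1)
    if state == 0:
        # Empty graph: X2 has no parents.
        prob_x2 = bernoulli(p_x2_state0, x2)
    else:
        # Edge X1 -> X2: X2 is conditioned on the observed X1.
        prob_x2 = bernoulli(p_x2_state1[x1], x2)
    return prob_x1 * prob_x2


def likelihood(observations):
    """Sequence likelihood via the forward recursion."""
    n_states = len(initial)
    alpha = initial * np.array([emission(s, observations[0]) for s in range(n_states)])
    for obs in observations[1:]:
        b = np.array([emission(s, obs) for s in range(n_states)])
        alpha = (alpha @ transition) * b
    return float(alpha.sum())


if __name__ == "__main__":
    print(likelihood([(1, 1), (0, 0), (1, 1)]))
```

In this sketch, state 0 needs two emission parameters and state 1 needs three, whereas a full joint table over two binary features would need three per state; with more features, that gap between sparse state-specific graphs and unrestricted emissions grows quickly.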
