Clustering Hidden Markov Models With Variational Bayesian Hierarchical EM

The hidden Markov model (HMM) is a widely used generative model for time-series data, and clustering HMMs has attracted increasing interest from machine learning researchers. However, the number of clusters (K) and the number of hidden states (S) of the cluster centers remain difficult to determine. In this article, we propose a novel HMM-based clustering algorithm, the variational Bayesian hierarchical EM algorithm, which clusters HMMs through their densities and priors and simultaneously learns posteriors for the novel HMM cluster centers that compactly represent the structure of each cluster. The numbers K and S are determined automatically in two ways. First, we place a prior on the pair (K,S) and approximate the posterior probability of each candidate pair, from which the pair with the maximum posterior is selected. Second, some clusters and states are pruned out implicitly when no data samples are assigned to them, thereby leading to automatic selection of the model complexity. Experiments on synthetic and real data demonstrate that our algorithm performs better than model selection techniques based on maximum likelihood estimation.
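The two selection mechanisms described above can be illustrated with a minimal Python sketch. It is not the article's implementation: it assumes that a variational lower bound on the log marginal likelihood is already available for each candidate pair (K,S) and that this bound stands in for the log evidence when approximating the posterior. The function names `select_model` and `prune_empty`, the tolerance `tol`, and all numbers are hypothetical.

```python
import math

def select_model(lower_bounds, log_prior=None):
    """Pick the (K, S) pair with the largest approximate posterior.

    lower_bounds: dict mapping (K, S) -> variational lower bound on the
                  log marginal likelihood (assumed to be precomputed).
    log_prior:    dict mapping (K, S) -> log prior; uniform if omitted.
    """
    if log_prior is None:
        log_prior = {c: 0.0 for c in lower_bounds}  # assumption: uniform prior
    # Unnormalized log posterior: log p(K, S) + lower bound.
    scores = {c: lower_bounds[c] + log_prior[c] for c in lower_bounds}
    # Normalize with log-sum-exp for numerical stability.
    m = max(scores.values())
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    posterior = {c: math.exp(s - log_norm) for c, s in scores.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior

def prune_empty(expected_counts, tol=1e-6):
    """Return indices of clusters (or states) that still receive data.

    expected_counts: expected number of samples assigned to each component.
    tol:             hypothetical threshold below which a component is dropped.
    """
    return [j for j, n in enumerate(expected_counts) if n > tol]

# Illustrative values only, not results from the article.
bounds = {(1, 2): -120.0, (2, 3): -100.0, (3, 3): -105.0}
best, posterior = select_model(bounds)
kept = prune_empty([4.2, 0.0, 3.1e-9, 1.7])
```

Here `best` is the candidate pair with the highest approximate posterior, and `kept` lists the components that survive pruning. In a real run, the bounds and expected counts would come from the fitted variational posteriors rather than being supplied by hand.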
