The variational hierarchical EM algorithm for clustering hidden Markov models

In this paper, we derive a novel algorithm to cluster hidden Markov models (HMMs) according to their probability distributions. We propose a variational hierarchical EM algorithm that i) clusters a given collection of HMMs into groups of HMMs that are similar in terms of the distributions they represent, and ii) characterizes each group by a "cluster center", i.e., a novel HMM that is representative of the group. We illustrate the benefits of the proposed algorithm on hierarchical clustering of motion capture sequences and on automatic music tagging.
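To make concrete the idea of grouping HMMs by the distributions they represent, the following is a minimal sketch under stated assumptions, not the variational hierarchical EM algorithm itself. It assumes discrete-emission HMMs stored as hypothetical `(pi, A, B)` tuples, replaces the variational approximation with a plain Monte Carlo estimate of the expected log-likelihood, uses hard assignments, and takes the cluster centers as given rather than re-estimating them. All function names and parameter values are illustrative.

```python
import math
import random

def sample_sequence(hmm, length, rng):
    """Draw one observation sequence from a discrete HMM given as (pi, A, B)."""
    pi, A, B = hmm
    states = list(range(len(pi)))
    symbols = list(range(len(B[0])))
    state = rng.choices(states, weights=pi)[0]
    seq = []
    for _ in range(length):
        seq.append(rng.choices(symbols, weights=B[state])[0])
        state = rng.choices(states, weights=A[state])[0]
    return seq

def log_likelihood(hmm, seq):
    """Scaled forward algorithm: returns log p(seq | hmm)."""
    pi, A, B = hmm
    n = len(pi)
    alpha = [pi[s] * B[s][seq[0]] for s in range(n)]
    total = 0.0
    for t, obs in enumerate(seq):
        if t > 0:
            alpha = [sum(alpha[r] * A[r][s] for r in range(n)) * B[s][obs]
                     for s in range(n)]
        scale = sum(alpha)          # p(obs_t | obs_1..t-1)
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total

def assign_to_centers(hmms, centers, n_samples=50, length=30, seed=0):
    """Hard-assign each HMM to the center under which its own samples are
    most likely on average (a Monte Carlo stand-in, assumed for illustration)."""
    rng = random.Random(seed)
    assignments = []
    for hmm in hmms:
        samples = [sample_sequence(hmm, length, rng) for _ in range(n_samples)]
        scores = [sum(log_likelihood(c, s) for s in samples) / n_samples
                  for c in centers]
        assignments.append(max(range(len(centers)), key=scores.__getitem__))
    return assignments

# Illustrative two-state, two-symbol HMMs (made-up parameters).
emit = [[0.9, 0.1], [0.1, 0.9]]
sticky = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], emit)        # tends to stay in a state
alternating = ([0.5, 0.5], [[0.1, 0.9], [0.9, 0.1]], emit)   # tends to switch states
sticky_like = ([0.5, 0.5], [[0.85, 0.15], [0.15, 0.85]], emit)
alternating_like = ([0.5, 0.5], [[0.15, 0.85], [0.85, 0.15]], emit)

assignments = assign_to_centers([sticky_like, alternating_like],
                                [sticky, alternating])
print(assignments)
```

Sampling is used here only because it is easy to verify; the approach described in the abstract is variational, and it also produces the cluster-center HMMs, which this sketch does not attempt.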
