Hidden Markov Model Clustering of Acoustic Data

This dissertation explores methods for cluster analysis of acoustic data. The techniques developed are applied primarily to whale song, but the problem is treated as generally as possible. Three algorithms are presented, all built around hidden Markov models (HMMs), implementing partitional, agglomerative, and divisive clustering respectively. Topology optimization through Bayesian model selection is explored as a way to determine both the number of clusters present and the model complexity each cluster requires, but the available methods are found to be unreliable for complex data. Several feature extraction procedures are examined, and their relative merits are compared for various types of data. Overall, hierarchical HMM clustering is found to be an effective tool for unsupervised learning of sound patterns.
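As a rough illustration of the partitional idea, the sketch below shows only the assignment step of a k-means-style HMM clustering: each sequence is scored under every cluster's HMM with the scaled forward algorithm and given to the model with the highest log-likelihood. It is not the dissertation's algorithm. The discrete-symbol HMMs, the toy parameters, and the names `log_likelihood`, `assign`, `model_a`, and `model_b` are all assumptions made for this example; a full method would alternate this step with re-estimating each model from its assigned sequences.

```python
import math


def log_likelihood(seq, start, trans, emit):
    """Return log P(seq | HMM) for a discrete-output HMM (scaled forward algorithm).

    start[i]   : probability of starting in state i
    trans[i][j]: probability of moving from state i to state j
    emit[i][o] : probability of emitting symbol o from state i
    """
    n = len(start)
    # Forward variables for the first observation.
    alpha = [start[i] * emit[i][seq[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate one step through the transitions, then weight by the emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][seq[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_p += math.log(scale)
        # Rescale so long sequences do not underflow.
        alpha = [a / scale for a in alpha]
    return log_p


def assign(sequences, models):
    """Assign each sequence to the index of the HMM that scores it highest."""
    return [
        max(range(len(models)), key=lambda k: log_likelihood(s, *models[k]))
        for s in sequences
    ]


# Toy, hypothetical models: model_a mostly emits symbol 0, model_b mostly symbol 1.
model_a = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]])
model_b = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]])

sequences = [[0, 0, 0, 1, 0], [1, 1, 0, 1, 1], [0, 0, 0, 0, 0]]
labels = assign(sequences, [model_a, model_b])
print(labels)
```

The sketch assumes non-empty sequences of integer symbols. Real acoustic features are usually continuous vectors, which would call for Gaussian or mixture emission densities in place of the lookup table `emit`.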
