Sparseness Achievement in Hidden Markov Models

This paper presents a novel learning algorithm for Hidden Markov Models (HMMs). The key goal is a sparse model, i.e., one in which all irrelevant parameters are set exactly to zero. As an alternative to standard maximum likelihood estimation (Baum-Welch training), the proposed approach casts parameter estimation in a Bayesian framework and introduces a negative Dirichlet prior, which strongly encourages sparseness of the model. A modified Expectation-Maximization algorithm is derived that computes a MAP (maximum a posteriori) estimate of the HMM parameters under this Bayesian formulation. Theoretical considerations and comparative experiments on a 2D shape classification task support the validity of the proposed technique.
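To make the idea of a sparsity-inducing M-step concrete, the following is a minimal sketch, not the paper's actual algorithm. It assumes a prior on each transition row proportional to the product of the probabilities raised to the power -alpha, for which the constrained MAP update is commonly written as "subtract alpha from each expected count, clip at zero, renormalize". The function names, the `alpha` value, and the example counts are hypothetical; the expected counts would normally come from the E-step (forward-backward).

```python
def sparse_map_row(expected_counts, alpha):
    """MAP-style update for one row of a transition matrix.

    Assumption (not taken from the paper): the prior is proportional to
    prod_j a_j ** (-alpha), so each expected count is reduced by alpha,
    clipped at zero, and the row is renormalized.
    """
    clipped = [max(0.0, c - alpha) for c in expected_counts]
    total = sum(clipped)
    if total > 0.0:
        return [c / total for c in clipped]
    # Fallback when every count is pruned: plain ML normalization,
    # or a uniform row if there are no counts at all.
    raw_total = sum(expected_counts)
    if raw_total > 0.0:
        return [c / raw_total for c in expected_counts]
    return [1.0 / len(expected_counts)] * len(expected_counts)


def sparse_map_transitions(expected_counts_matrix, alpha):
    """Apply the row update to every row of the expected-count matrix."""
    return [sparse_map_row(row, alpha) for row in expected_counts_matrix]


# Hypothetical expected transition counts for a 2-state-by-3-state example.
xi = [
    [8.0, 0.3, 1.7],
    [0.2, 9.5, 0.3],
]
A = sparse_map_transitions(xi, alpha=0.5)
for row in A:
    print(row)
```

Transitions whose expected count falls below `alpha` end up exactly at zero, which is the behavior the abstract describes as sparseness; plain Baum-Welch re-estimation would instead leave them small but nonzero.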
