A similarity measure between vector sequences with application to handwritten word image retrieval

This article proposes a novel similarity measure between vector sequences. Recently, a model-based approach was introduced to address this problem. It consists of modeling each sequence with a continuous Hidden Markov Model (C-HMM) and computing a probabilistic measure of similarity between C-HMMs. We propose instead to model sequences with semi-continuous HMMs (SC-HMMs): the Gaussians of the SC-HMMs are constrained to belong to a shared pool of Gaussians. This constraint provides two major benefits. First, the a priori information contained in the common set of Gaussians leads to a more accurate estimate of the HMM parameters. Second, the computation of a probabilistic similarity between two SC-HMMs can be simplified to a Dynamic Time Warping (DTW) between their mixture weight vectors, which significantly reduces the computational cost. Experimental results on a handwritten word retrieval task show that the proposed similarity outperforms both the traditional DTW between the original sequences and the model-based approach that uses C-HMMs. We also show that this increase in accuracy can be traded for a significant reduction in computational cost (by a factor of up to 100).
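As an illustration of the second benefit, the following minimal sketch runs DTW over two sequences of mixture weight vectors, one vector per HMM state, all defined over the same shared pool of Gaussians. The local cost used here (one minus the Bhattacharyya coefficient between weight vectors), the normalization by path-length bound, and the function names `bhattacharyya_cost` and `dtw_distance` are assumptions made for the example, not details given in the abstract.

```python
import numpy as np


def bhattacharyya_cost(p, q):
    """Local cost between two mixture weight vectors (assumed choice).

    Both vectors are non-negative and sum to 1 over the shared Gaussians,
    so the cost is 0 for identical vectors and 1 for disjoint supports.
    """
    return 1.0 - float(np.sum(np.sqrt(p * q)))


def dtw_distance(A, B, local_cost=bhattacharyya_cost):
    """DTW between two sequences of weight vectors.

    A has shape (m, K) and B has shape (n, K), where K is the size of the
    shared Gaussian pool and each row holds the weights of one HMM state.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    m, n = len(A), len(B)

    # D[i, j] = cost of the best alignment of A[:i] with B[:j].
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            c = local_cost(A[i - 1], B[j - 1])
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])

    # Normalize so that sequences of different lengths are comparable.
    return D[m, n] / (m + n)


if __name__ == "__main__":
    word_a = [[1.0, 0.0], [0.0, 1.0]]
    word_b = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # same pattern, stretched
    word_c = [[0.0, 1.0], [1.0, 0.0]]              # reversed pattern
    print(dtw_distance(word_a, word_b))
    print(dtw_distance(word_a, word_c))
```

Because each sequence has only as many elements as its HMM has states, and each local cost is a single pass over K weights, the alignment is far cheaper than a DTW over the original frame-level feature sequences.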
