LDA-Based Word Image Representation for Keyword Spotting on Historical Mongolian Documents

The original Bag-of-Visual-Words (BoVW) approach discards the spatial relations among visual words. In this paper, an LDA-based topic model is adopted to capture the semantic relations among the visual words of each word image. Because an LDA-based topic model usually hurts retrieval performance when it is used on its own, it is linearly combined with a visual language model for each word image in this study. The basic query likelihood model is then used to perform retrieval. Experimental results on our dataset show that the proposed LDA-based representation achieves efficient and accurate keyword spotting on a collection of historical Mongolian documents, and that it significantly outperforms the original BoVW approach.
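The retrieval pipeline summarized above (a topic model linearly combined with a visual language model, then ranked by query likelihood) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption for illustration only: the interpolation weight `lam`, the add-epsilon smoothing of the visual language model, the function names, and the toy topic distributions `phi` and `theta`, which in practice would come from a trained LDA model.

```python
import math

def lm_prob(word, counts, vocab_size, eps=1.0):
    """Add-eps smoothed unigram probability of a visual word in one word image.
    (Smoothing choice is an assumption, not taken from the paper.)"""
    total = sum(counts.values())
    return (counts.get(word, 0) + eps) / (total + eps * vocab_size)

def lda_prob(word, theta, phi):
    """Topic-model probability: sum over topics k of P(word | k) * P(k | image)."""
    return sum(theta[k] * phi[k][word] for k in range(len(theta)))

def combined_prob(word, counts, theta, phi, vocab_size, lam=0.7):
    """Linear combination of the visual language model and the LDA model.
    lam is a hypothetical interpolation weight."""
    return (lam * lm_prob(word, counts, vocab_size)
            + (1.0 - lam) * lda_prob(word, theta, phi))

def query_log_likelihood(query, counts, theta, phi, vocab_size, lam=0.7):
    """Basic query likelihood: sum of log-probabilities of the query's visual words."""
    return sum(math.log(combined_prob(w, counts, theta, phi, vocab_size, lam))
               for w in query)

# Toy setup (all values hypothetical): 4 visual words, 2 topics.
VOCAB_SIZE = 4
phi = [
    [0.70, 0.20, 0.05, 0.05],  # P(word | topic 0)
    [0.05, 0.05, 0.20, 0.70],  # P(word | topic 1)
]
# Each word image: (visual-word counts, topic mixture theta).
images = {
    "img_a": ({0: 5, 1: 2}, [0.9, 0.1]),
    "img_b": ({2: 1, 3: 6}, [0.1, 0.9]),
}
query = [0, 1, 0]  # visual words extracted from the query word image

# Rank word images by descending query log-likelihood.
ranking = sorted(
    images,
    key=lambda name: query_log_likelihood(query, *images[name], phi, VOCAB_SIZE),
    reverse=True,
)
print(ranking)
```

Because both component models are proper distributions over the visual vocabulary, their linear combination is one as well, so the scores of different word images remain comparable for a given query.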
