A Modified Isomap Approach to Manifold Learning in Word Spotting

Word spotting is an effective paradigm for indexing document images with minimal human effort. Here, the use of the Bag-of-Features principle has been shown to achieve competitive results on different benchmarks. Recently, a spatial pyramid approach was used as a word image representation to improve the retrieval results even further. The high dimensionality of the spatial pyramids was attempted to be countered by applying Latent Semantic Analysis. However, this leads to increasingly worse results when reducing to lower dimensions. In this paper, we propose a new approach to reducing the dimensionality of word image descriptors which is based on a modified version of the Isomap Manifold Learning algorithm. This approach is able to not only outperform Latent Semantic Analysis but also to reduce a word image descriptor to up to \(0.12\,\%\) of its original size without losing retrieval precision. We evaluate our approach on two different datasets.

[1]  Nicolas Le Roux,et al.  Out-of-Sample Extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering , 2003, NIPS.

[2]  S T Roweis,et al.  Nonlinear dimensionality reduction by locally linear embedding. , 2000, Science.

[3]  Ammad Ali,et al.  Face Recognition with Local Binary Patterns , 2012 .

[4]  Ernest Valveny,et al.  Deformable HOG-Based Shape Descriptor , 2013, 2013 12th International Conference on Document Analysis and Recognition.

[5]  R. Manmatha,et al.  Word spotting for historical documents , 2006, International Journal of Document Analysis and Recognition (IJDAR).

[6]  张振跃,et al.  Principal Manifolds and Nonlinear Dimensionality Reduction via Tangent Space Alignment , 2004 .

[7]  Ernest Valveny,et al.  Word Spotting and Recognition with Embedded Attributes , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Gernot A. Fink,et al.  Bag-of-Features HMMs for Segmentation-Free Word Spotting in Handwritten Documents , 2013, 2013 12th International Conference on Document Analysis and Recognition.

[9]  Nicolas Le Roux,et al.  Learning Eigenfunctions Links Spectral Embedding and Kernel PCA , 2004, Neural Computation.

[10]  Josep Lladós,et al.  Efficient segmentation-free keyword spotting in historical document collections , 2015, Pattern Recognit..

[11]  J. Tenenbaum,et al.  A global geometric framework for nonlinear dimensionality reduction. , 2000, Science.

[12]  Josep Lladós,et al.  Integrating Visual and Textual Cues for Query-by-String Word Spotting , 2013, 2013 12th International Conference on Document Analysis and Recognition.

[13]  Thomas Mensink,et al.  Improving the Fisher Kernel for Large-Scale Image Classification , 2010, ECCV.

[14]  Joshua B. Tenenbaum,et al.  Global Versus Local Methods in Nonlinear Dimensionality Reduction , 2002, NIPS.