Word-Graph Based Handwriting Key-Word Spotting: Impact of Word-Graph Size on Performance

Key-Word Spotting (KWS) in handwritten documents is approached here by means of Word Graphs (WG) obtained using segmentation-free handwritten text recognition technology based on N-gram Language Models and Hidden Markov Models. Linguistic context significantly boosts KWS performance with respect to methods which ignore word contexts and/or rely on image matching with pre-segmented isolated words. On the other hand, WG-based KWS can be significantly faster than other KWS approaches which work directly on the original images, where computational demands are, in general, exceedingly high. A large WG contains most of the information in the original text (line) image that is needed for KWS but, if it is too large, the computational advantages over traditional KWS based on image matching diminish. Conversely, if it is too small, relevant information may be lost, leading to degraded KWS precision/recall performance. We study the trade-off between WG size and KWS information retrieval performance. Results show that small, computationally cheap WGs can be used without losing the excellent KWS performance achieved with huge WGs.
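
The size/performance trade-off can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the four-node graph, the edge probabilities, the function names, the use of the best single edge posterior as the keyword score, and the posterior-threshold pruning rule. It is not the recognizer, scoring method, or data used in the paper. It only shows how a keyword's confidence can be read off a word graph with a forward-backward pass, and how pruning the graph too aggressively loses a low-probability hypothesis.

```python
from collections import defaultdict

# Hypothetical toy word graph for one text line:
# (start_node, end_node, word, edge_probability).
# Node ids are assumed to be in topological order; 0 is initial, 3 is final.
EDGES = [
    (0, 1, "the", 0.6),
    (0, 1, "he", 0.4),
    (1, 2, "word", 0.7),
    (1, 2, "ward", 0.3),
    (2, 3, "graph", 0.9),
    (2, 3, "grape", 0.1),
]


def edge_posteriors(edges, start, final):
    """Return (edge, posterior) pairs via forward-backward on a DAG."""
    forward = defaultdict(float)
    backward = defaultdict(float)
    forward[start] = 1.0
    backward[final] = 1.0
    # Forward pass: edges leaving lower-numbered nodes are processed first.
    for u, v, _, p in sorted(edges, key=lambda e: e[0]):
        forward[v] += forward[u] * p
    # Backward pass: edges entering higher-numbered nodes are processed first.
    for u, v, _, p in sorted(edges, key=lambda e: e[1], reverse=True):
        backward[u] += p * backward[v]
    total = forward[final]
    if total == 0.0:  # no complete path survives
        return []
    return [((u, v, w, p), forward[u] * p * backward[v] / total)
            for u, v, w, p in edges]


def keyword_score(edges, keyword, start=0, final=3):
    """Simplified line-level score: best posterior of any edge labeled keyword."""
    scores = [post for (_, _, w, _), post in edge_posteriors(edges, start, final)
              if w == keyword]
    return max(scores, default=0.0)


def prune(edges, threshold, start=0, final=3):
    """Keep only edges whose posterior reaches the threshold (toy pruning rule)."""
    return [e for e, post in edge_posteriors(edges, start, final)
            if post >= threshold]


small = prune(EDGES, threshold=0.2)
print(len(EDGES), keyword_score(EDGES, "grape"))   # full graph keeps "grape"
print(len(small), keyword_score(small, "grape"))   # pruned graph has lost it
```

In this toy setting the pruned graph still scores the likely words, but the rare hypothesis "grape" can no longer be retrieved at any threshold, which is the kind of recall loss that an overly small WG would cause.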

[1]  Hermann Ney,et al.  A word graph algorithm for large vocabulary continuous speech recognition , 1994, Comput. Speech Lang..

[2]  Hermann Ney,et al.  Confidence measures for large vocabulary continuous speech recognition , 2001, IEEE Trans. Speech Audio Process..

[3]  Richard M. Schwartz,et al.  An Omnifont Open-Vocabulary OCR System for English and Arabic , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[4]  Andreas Keller,et al.  Lexicon-free handwritten word spotting using character HMMs , 2012, Pattern Recognit. Lett..

[5]  Volkmar Frinken,et al.  A Novel Word Spotting Method Based on Recurrent Neural Networks , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Frederick Jelinek,et al.  Statistical methods for speech recognition , 1997 .

[7]  Hermann Ney,et al.  Word-Level Confidence Estimation for Machine Translation , 2007, CL.

[8]  Stephen E. Robertson,et al.  A new interpretation of average precision , 2008, SIGIR '08.

[9]  Alejandro Héctor Toselli Rossi,et al.  Multimodal Interactive Handwritten Text Transcription , 2012, Series in Machine Perception and Artificial Intelligence.

[10]  Samy Bengio,et al.  Offline recognition of unconstrained handwritten texts using HMMs and statistical language models , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  N. Strom Generation and Minimization of Word Graphs in Continuous Speech Recognition , 2007 .

[12]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[13]  Hermann Ney,et al.  Word graphs: an efficient interface between continuous-speech recognition and language understanding , 1993, 1993 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[14]  Laurence Likforman-Sulem,et al.  Text line segmentation of historical documents: a survey , 2007, International Journal of Document Analysis and Recognition (IJDAR).

[15]  Volkmar Frinken,et al.  HMM word graph based keyword spotting in handwritten document images , 2016, Inf. Sci..

[16]  Alejandro Héctor Toselli Rossi,et al.  Fast HMM-Filler Approach for Key Word Spotting in Handwritten Documents , 2013, 2013 12th International Conference on Document Analysis and Recognition.