Paper information - Word image matching using dynamic time warping

Word image matching using dynamic time warping

Libraries and other institutions are interested in providing access to scanned versions of their large collections of handwritten historical manuscripts on electronic media. Convenient access to a collection requires an index, which is manually created at great labor and expense. Since current handwriting recognizers do not perform well on historical documents, a technique called word spotting has been developed: clusters with occurrences of the same word in a collection are established using image matching. By annotating "interesting" clusters, an index can be built automatically. We present an algorithm for matching handwritten words in noisy historical documents. The segmented word images are preprocessed to create sets of 1-dimensional features, which are then compared using dynamic time warping. We present experimental results on two different data sets from the George Washington collection. Our experiments show that this algorithm performs better and is faster than competing matching techniques.
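The matching step described in the abstract, comparing 1-dimensional feature sequences with dynamic time warping, can be illustrated with a minimal sketch. This is the generic textbook recurrence, not the authors' implementation: the function name `dtw_distance`, the squared-difference local cost, the absence of a warping-band constraint or path-length normalization, and the toy feature values are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two 1-D sequences.

    Assumption: the local cost is the squared difference of samples;
    the abstract does not specify the paper's actual cost or constraints.
    """
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# Hypothetical per-column features for two images of the same word,
# the second written more widely (some columns are repeated).
word_a = [0, 2, 5, 5, 1, 0]
word_b = [0, 2, 2, 5, 5, 5, 1, 0]
print(dtw_distance(word_a, word_b))
```

Because the warping path may repeat samples of either sequence, a horizontally stretched copy of a profile can align at zero cost, which is the property that makes this kind of matching tolerant of variation in handwriting width.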

R. Manmatha | Toni M. Rath
