A Proposition of Retrieval Tools for Historical Document Images Libraries

In this article, we propose a method for characterizing images of old documents based on a texture approach. The characterization relies on a multiresolution analysis of the textures contained in the document images. By extracting five features linked to the frequencies and orientations found in the different parts of a page, it is possible to extract and compare elements of a high semantic level without making any hypothesis about the physical or logical structure of the analysed documents. Experiments show that it is feasible to build tools that assist navigation and indexing. In these experiments, we emphasize the relevance of these texture features and the advance they represent for characterizing the content of a deeply heterogeneous corpus.
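To make the idea of block-wise, multiresolution texture description more concrete, the following is a minimal sketch and not the authors' method: the abstract does not define the five features, so this example computes only two illustrative quantities per block (a dominant gradient orientation and its strength, from a structure tensor) at several resolutions. The function names, the block size, the scale factors, and the choice of structure tensor are all assumptions made for illustration.

```python
import numpy as np


def downsample(img, factor):
    """Average-pool a 2-D array by an integer factor (crops any remainder)."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    pooled = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return pooled.mean(axis=(1, 3))


def dominant_orientation(block):
    """Return (angle, strength) of the dominant gradient direction in a block.

    angle is in degrees in [0, 180); strength is in [0, 1], where 1 means
    all gradients share one direction. Illustrative feature, not the paper's.
    """
    gy, gx = np.gradient(block.astype(float))
    jxx, jyy, jxy = (gx * gx).sum(), (gy * gy).sum(), (gx * gy).sum()
    angle = (0.5 * np.degrees(np.arctan2(2 * jxy, jxx - jyy))) % 180
    total = jxx + jyy
    strength = 0.0 if total == 0 else np.hypot(jxx - jyy, 2 * jxy) / total
    return angle, strength


def multiresolution_features(page, scales=(1, 2, 4), block=16):
    """Map each scale factor to a (rows, cols, 2) grid of block features.

    No layout hypothesis is used: the page is simply tiled into square blocks.
    """
    features = {}
    for s in scales:
        level = downsample(page, s)
        rows, cols = level.shape[0] // block, level.shape[1] // block
        grid = np.zeros((rows, cols, 2))
        for r in range(rows):
            for c in range(cols):
                patch = level[r * block:(r + 1) * block,
                              c * block:(c + 1) * block]
                grid[r, c] = dominant_orientation(patch)
        features[s] = grid
    return features
```

On a synthetic page of horizontal stripes, which is a rough stand-in for lines of text, every block reports a gradient direction close to 90 degrees with a strength close to 1, whereas a block covering an illustration would typically give a weaker, less consistent response. Comparing such per-block vectors across pages is one possible way to group similar regions without segmenting the layout first.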
