A symbol spotting approach in graphic documents (original French title: Une approche de localisation de symboles non-segmentés dans des documents graphiques)

This paper addresses the problem of symbol spotting in graphic documents. We propose an approach in which each graphic document is indexed as a text document, using the vector model and an inverted file structure. The method relies on a visual vocabulary built from a shape descriptor that is adapted to the document level and invariant under the classical geometric transforms (rotation, scaling and translation). Regions of interest (ROI) selected with a high degree of confidence by a voting strategy are considered occurrences of the query symbol.

The symbol spotting problem consists in locating all instances of a symbol embedded in documents. Representing these symbols is not straightforward, even with a good shape (symbol) descriptor, because they are not isolated from their context. A common strategy for symbol spotting therefore consists in decomposing documents into components and applying a shape descriptor to each of them. Most approaches need a vectorization step, and usually only symbols that satisfy certain conditions (e.g. convexity, connectivity, closure) are retrieved. Our objective is to tackle the problem from a point of view where neither a symbol hypothesis nor a vectorization step is needed. First, we propose a descriptor to represent graphic symbols, together with its extension to the document level. Then, we exploit a ...

Traitement du Signal, 2009, volume 26, special issue 5, Le document écrit, p. 419.

1. Translated from "Shape Context".
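The indexing idea summarized in the abstract (visual words, vector model, inverted file, voting over regions of interest) can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions made for illustration only: each region of interest is taken to be already quantized into a list of integer visual-word ids, the vector model is instantiated with tf-idf weights and cosine similarity, and the names `build_inverted_file` and `spot` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_inverted_file(regions):
    """Index regions of interest as bags of visual words.

    regions: dict mapping a region key, e.g. (document, roi_index),
    to a list of visual-word ids (assumed to come from an earlier
    descriptor-quantization step that is not shown here).
    """
    postings = defaultdict(dict)  # word -> {region key: term frequency}
    for key, words in regions.items():
        for word, tf in Counter(words).items():
            postings[word][key] = tf
    n = len(regions)
    # Inverse document frequency: rarer visual words weigh more.
    idf = {w: math.log(n / len(p)) for w, p in postings.items()}
    # Euclidean norm of each region's tf-idf vector (for the cosine).
    squared = defaultdict(float)
    for w, p in postings.items():
        for key, tf in p.items():
            squared[key] += (tf * idf[w]) ** 2
    norms = {k: math.sqrt(v) for k, v in squared.items()}
    return postings, idf, norms


def spot(query_words, postings, idf, norms):
    """Rank regions by cosine similarity to the query symbol.

    Each query word votes only for the regions listed in its
    posting list, so regions sharing no word are never visited.
    """
    q = {w: tf * idf[w] for w, tf in Counter(query_words).items() if w in idf}
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    votes = defaultdict(float)
    for w, q_weight in q.items():
        for key, tf in postings[w].items():
            votes[key] += q_weight * tf * idf[w]
    scores = {
        k: v / (q_norm * norms[k])
        for k, v in votes.items()
        if q_norm > 0 and norms[k] > 0
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: three regions from two hypothetical documents.
regions = {
    ("plan1", 0): [3, 7, 12],
    ("plan1", 1): [5, 9, 9],
    ("plan2", 0): [3, 3, 7, 5],
}
postings, idf, norms = build_inverted_file(regions)
results = spot([3, 7, 12], postings, idf, norms)
for key, score in results:
    print(key, round(score, 3))
```

In this toy run the region whose visual words match the query exactly is ranked first, the partially matching region comes second, and the region that shares no visual word with the query receives no vote at all. A confidence threshold on the score would then decide which regions are reported as occurrences; that threshold is left out because the abstract does not specify one.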
