An Indexing Method for Graphical Documents

This paper presents a method for browsing symbols in graphical documents. More precisely, we propose a combined filtering and indexing mechanism that efficiently retrieves the symbols most similar to a given input query. For a database of 200,000 symbols, the retrieval time is reduced by a factor of 4.5 compared to a linear search.
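The abstract does not specify the symbol descriptors or the index structure, so the following is only a minimal sketch of the general filter-then-search idea, not the method of the paper. It assumes symbols are represented as fixed-length feature vectors compared with Euclidean distance, and it uses a crude pivot-based bucketing as a stand-in for a real index. All names (`build_index`, `filtered_search`, `n_probe`, the pivot count) are hypothetical.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two feature vectors (assumed descriptor metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def linear_search(db, query, k):
    """Baseline: compare the query against every descriptor in the database."""
    return sorted(range(len(db)), key=lambda i: dist(db[i], query))[:k]

def build_index(db, pivots):
    """Group descriptor ids by their nearest pivot (a crude stand-in for an index)."""
    buckets = {p: [] for p in range(len(pivots))}
    for i, v in enumerate(db):
        nearest = min(range(len(pivots)), key=lambda p: dist(v, pivots[p]))
        buckets[nearest].append(i)
    return buckets

def filtered_search(db, pivots, buckets, query, k, n_probe=2):
    """Filter: keep only the n_probe buckets closest to the query, then rank them."""
    order = sorted(range(len(pivots)), key=lambda p: dist(query, pivots[p]))
    candidates = [i for p in order[:n_probe] for i in buckets[p]]
    return sorted(candidates, key=lambda i: dist(db[i], query))[:k]

# Synthetic data only: random 8-dimensional vectors, not real symbol descriptors.
random.seed(0)
db = [[random.random() for _ in range(8)] for _ in range(2000)]
pivots = random.sample(db, 16)
buckets = build_index(db, pivots)

query = db[42]
exact = linear_search(db, query, 5)
approx = filtered_search(db, pivots, buckets, query, 5)
```

In this sketch the filtered search computes distances only for the descriptors in the probed buckets, which is where a speed-up over the linear scan would come from. The price is that the result can be approximate: a true neighbor lying in an unprobed bucket is missed, and raising `n_probe` trades speed for recall.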
