A protocol to characterize the descriptive power and the complementarity of shape descriptors

Most document analysis applications rely on the extraction of shape descriptors, which may be grouped into different categories, each with its own advantages and drawbacks (O.R. Terrades et al. in Proceedings of ICDAR’07, pp. 227–231, 2007). To enrich their description, many authors choose to combine multiple descriptors. Yet most authors who propose a new descriptor are content to compare its performance with that of a set of single state-of-the-art descriptors in a specific application context (e.g. symbol recognition or symbol spotting). This results in a proliferation of shape descriptors in the literature. In this article, we propose an innovative protocol whose originality is to be as independent of the final application as possible, and which relies on new quantitative and qualitative measures. We introduce two types of measures: the first type characterizes the descriptive power of a descriptor (in terms of uniqueness, distinctiveness and robustness to noise), while the second type characterizes the complementarity between multiple descriptors. Characterizing the complementarity of shape descriptors upstream is an alternative to the usual approach, in which the descriptors to be combined are selected by trial and error based on the performance of the overall system. To illustrate the contribution of this protocol, we performed experimental studies using a set of descriptors and a set of symbols that are widely used by the community, namely the ART and SC descriptors and the GREC 2003 database.
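To make the notion of complementarity concrete, the following minimal sketch shows one possible way to quantify it before any combination is attempted. It is not the measure defined in the article, whose formulas are not given in this abstract. It assumes, purely for illustration, that each descriptor is judged by leave-one-out nearest-neighbour retrieval with a Euclidean distance, and that two descriptors are complementary to the extent that one succeeds where the other fails. The function names `nearest_neighbor_correct` and `complementarity` are hypothetical.

```python
import numpy as np


def nearest_neighbor_correct(features, labels):
    """Leave-one-out 1-NN check (illustrative assumption, not the paper's protocol).

    Returns a boolean array that is True where the nearest *other* sample,
    under Euclidean distance, carries the same class label.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all feature vectors.
    diffs = features[:, None, :] - features[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # A sample must not be matched with itself.
    np.fill_diagonal(dists, np.inf)
    nearest = dists.argmin(axis=1)
    return labels[nearest] == labels


def complementarity(correct_a, correct_b):
    """Share of failures that the other descriptor could recover.

    Among samples where at least one descriptor fails, return the fraction
    where exactly one of them is correct. Returns 0.0 when neither fails.
    """
    correct_a = np.asarray(correct_a, dtype=bool)
    correct_b = np.asarray(correct_b, dtype=bool)
    any_failure = ~(correct_a & correct_b)
    if not any_failure.any():
        return 0.0
    recoverable = correct_a ^ correct_b
    return float(recoverable.sum() / any_failure.sum())


if __name__ == "__main__":
    # Toy data: four symbols from two classes, described by two
    # hypothetical one-dimensional descriptors.
    labels = [0, 0, 1, 1]
    descriptor_a = [[0.0], [0.1], [1.0], [1.1]]  # separates the classes
    descriptor_b = [[0.0], [1.0], [0.1], [1.1]]  # confuses the classes
    ok_a = nearest_neighbor_correct(descriptor_a, labels)
    ok_b = nearest_neighbor_correct(descriptor_b, labels)
    print(complementarity(ok_a, ok_b))
```

A score near 1 under this toy definition suggests that combining the two descriptors could recover most individual failures, whereas a score near 0 suggests that they fail on the same symbols and that combining them would bring little.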
