A geometric approach for accurate and efficient performance evaluation of layout analysis methods

A major component of performance evaluation of layout analysis methods is the comparison of ground truth regions with regions resulting from segmentation methods. The description of document regions must be both accurate in describing complex layouts and efficient in view of the large number of region comparisons that must be performed. Previous approaches favour either accuracy or efficiency, resulting in an impractical compromise. This paper presents an improved approach that uses polygons to accurately describe both segmentation and ground truth regions. Polygonal descriptions are compared efficiently using a decomposition into rectangular intervals. This approach has been validated using data from the ICDAR page segmentation competitions.
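
To illustrate why a decomposition into rectangles makes region comparison cheap, the following is a minimal sketch, not the paper's implementation. It assumes each region has already been decomposed into non-overlapping axis-aligned rectangles with half-open pixel coordinates; the names `Interval` and `overlap_area` and the example coordinates are hypothetical.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """One axis-aligned rectangle of a decomposed region (half-open coordinates)."""
    top: int
    bottom: int
    left: int
    right: int

    def area(self) -> int:
        return (self.bottom - self.top) * (self.right - self.left)


def overlap_area(a: List[Interval], b: List[Interval]) -> int:
    """Sum the pairwise rectangle intersections of two decomposed regions.

    Assumes the rectangles within each region do not overlap one another,
    so no area is counted twice.
    """
    total = 0
    for ra in a:
        for rb in b:
            height = min(ra.bottom, rb.bottom) - max(ra.top, rb.top)
            width = min(ra.right, rb.right) - max(ra.left, rb.left)
            if height > 0 and width > 0:
                total += height * width
    return total


# Hypothetical L-shaped ground truth region, decomposed into two rectangles.
ground_truth = [Interval(0, 10, 0, 20), Interval(10, 30, 0, 10)]
# Hypothetical segmentation result consisting of a single rectangle.
segmented = [Interval(5, 25, 5, 15)]

shared = overlap_area(ground_truth, segmented)
gt_area = sum(r.area() for r in ground_truth)
print(shared, gt_area, shared / gt_area)
```

Each comparison reduces to integer minimum and maximum operations on rectangle bounds, with no general polygon clipping. A practical implementation would likely sort the rectangles by vertical position so that non-overlapping pairs can be skipped rather than tested.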
