Performance Evaluation Tools for Zone Segmentation and Classification (PETS)

This paper describes a set of Performance Evaluation Tools (PETS) for document image zone segmentation and classification. The tools provide a variety of quantitative performance metrics that allow researchers and developers to evaluate, optimize, and compare their algorithms. Segmentation quality is evaluated using the pixel-based overlap between two sets of zones, an approach proposed by Randriamasy and Vincent. PETS extends this approach with a set of metrics for overlap analysis, supports run-length-encoded (RLE) and polygonal representations of zones, and introduces type matching to evaluate zone classification. The software is available for research use.
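
To make the idea of pixel-based overlap with type matching concrete, the following is a minimal sketch and not the PETS implementation. It assumes axis-aligned rectangular zones with half-open pixel bounds (PETS itself works with RLE and polygonal zones) and assumes that zones within each set do not overlap one another. The names `Zone`, `overlap_matrix`, and `type_matched_fraction` are hypothetical, and the single score computed here is an illustrative stand-in, not one of the metrics defined in the paper.

```python
from typing import List, NamedTuple


class Zone(NamedTuple):
    """Hypothetical rectangular zone; x1 and y1 are exclusive bounds."""
    x0: int
    y0: int
    x1: int
    y1: int
    label: str


def area(z: Zone) -> int:
    """Number of pixels inside the zone."""
    return max(0, z.x1 - z.x0) * max(0, z.y1 - z.y0)


def overlap_pixels(a: Zone, b: Zone) -> int:
    """Number of pixels shared by two rectangular zones."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    return max(0, w) * max(0, h)


def overlap_matrix(gt: List[Zone], hyp: List[Zone]) -> List[List[int]]:
    """Rows are ground-truth zones, columns are hypothesis zones."""
    return [[overlap_pixels(g, h) for h in hyp] for g in gt]


def type_matched_fraction(gt: List[Zone], hyp: List[Zone]) -> float:
    """Fraction of ground-truth pixels covered by a hypothesis zone
    carrying the same label (assumes no overlaps within each set)."""
    total = sum(area(g) for g in gt)
    matched = sum(
        overlap_pixels(g, h) for g in gt for h in hyp if g.label == h.label
    )
    return matched / total if total else 0.0


# Illustrative data: the hypothesis text zone spills into the image zone.
gt = [Zone(0, 0, 10, 10, "text"), Zone(10, 0, 20, 10, "image")]
hyp = [Zone(0, 0, 15, 10, "text"), Zone(15, 0, 20, 10, "image")]

print(overlap_matrix(gt, hyp))
print(type_matched_fraction(gt, hyp))
```

In this sketch the overlap matrix exposes segmentation errors (a ground-truth zone overlapping several hypothesis zones, or the reverse), while the label comparison adds the classification check on top of the same pixel counts.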
