Optimized Block-Based Connected Components Labeling With Decision Trees

In this paper, we define a new paradigm for eight-connection labeling that employs a general approach to improve neighborhood exploration and minimize the number of memory accesses. First, we exploit and extend the decision table formalism by introducing or-decision tables, in which multiple alternative actions are managed. An automatic procedure synthesizes the optimal decision tree from the decision table, providing the most effective order in which to evaluate the conditions. Second, we propose a new scanning technique that moves over the image on a 2 × 2 pixel grid and is optimized by the automatically generated decision tree. An extensive comparison with state-of-the-art approaches is presented on both synthetic and real datasets. The synthetic dataset is composed of random images of different sizes and densities, while the real datasets are an artistic image analysis dataset, a document analysis dataset for text detection and recognition, and a standard-resolution dataset for picture segmentation tasks. The algorithm provides an impressive speedup over the state-of-the-art algorithms.
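The block-based scanning idea can be illustrated with a minimal sketch. It relies on one fact: any two pixels inside a 2 × 2 block are 8-adjacent, so all foreground pixels of a block belong to the same component and can share a single provisional label. The code below is not the method of the paper: it replaces the optimal decision tree with an exhaustive pixel-pair check, which is exactly the redundant work the decision tree is meant to avoid. The names `label_blocks`, `find`, and `union` are hypothetical and chosen for illustration only.

```python
def find(parent, x):
    """Return the root of x in the union-find forest, with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the equivalence classes of a and b, keeping the smaller root."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def label_blocks(img):
    """Two-pass 8-connected labeling that scans 2x2 blocks instead of pixels.

    img is a list of equal-length rows; nonzero values are foreground.
    Returns (label_image, number_of_components).
    """
    h, w = len(img), len(img[0])
    bh, bw = (h + 1) // 2, (w + 1) // 2  # block grid, padded for odd sizes

    def fg_cells(br, bc):
        # Foreground pixel coordinates inside block (br, bc).
        r0, c0 = 2 * br, 2 * bc
        return [(r0 + dr, c0 + dc)
                for dr in (0, 1) for dc in (0, 1)
                if r0 + dr < h and c0 + dc < w and img[r0 + dr][c0 + dc]]

    parent = [0]  # index 0 is reserved for background
    block_label = [[0] * bw for _ in range(bh)]

    # First pass: one provisional label per non-empty block.
    for br in range(bh):
        for bc in range(bw):
            cells = fg_cells(br, bc)
            if not cells:
                continue
            parent.append(len(parent))
            lab = len(parent) - 1
            block_label[br][bc] = lab
            # Only the already-scanned neighbor blocks need to be examined:
            # left, upper-left, upper, upper-right.
            for nbr, nbc in ((br, bc - 1), (br - 1, bc - 1),
                             (br - 1, bc), (br - 1, bc + 1)):
                if not (0 <= nbr < bh and 0 <= nbc < bw):
                    continue
                if not block_label[nbr][nbc]:
                    continue
                # Exhaustive check (assumption of this sketch): the blocks
                # are connected if any pixel pair across them is 8-adjacent.
                if any(abs(r - p) <= 1 and abs(c - q) <= 1
                       for r, c in cells for p, q in fg_cells(nbr, nbc)):
                    union(parent, lab, block_label[nbr][nbc])

    # Second pass: resolve equivalences and write final labels per pixel.
    remap = {}
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if img[r][c]:
                root = find(parent, block_label[r // 2][c // 2])
                out[r][c] = remap.setdefault(root, len(remap) + 1)
    return out, len(remap)
```

Because labels are assigned per block, the equivalence structure holds at most one entry per 2 × 2 block rather than one per pixel, which is where the reduction in memory accesses comes from in this simplified version.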
