A single-cycle parallel multi-slice connected components analysis hardware architecture

Abstract: This paper proposes a memory-efficient architecture for single-pass connected components analysis, suited to high-throughput embedded image processing systems, which achieves a speedup by partitioning the image into slices. Although image segments that span several slices introduce global data dependencies, a temporally and spatially local algorithm is proposed, together with a suitable FPGA hardware architecture that processes pixel data at low latency. This low latency allows labels associated with image objects to be reused, which reduces the amount of memory by a factor of more than 5 in the considered implementations. This is a significant contribution, since memory is a critical resource in embedded image processing on FPGAs. As a result, the architecture can process a significantly higher bandwidth of pixel data than state-of-the-art architectures using the same amount of hardware resources.
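The slice-partitioning idea can be illustrated in software. The following is a minimal sketch, not the paper's hardware architecture: it assumes a binary image, 8-connectivity, vertical slices, and object area as the only extracted feature. The names `component_areas`, `find`, `union`, and `IMAGE` are hypothetical. Each slice is labelled independently with a union-find table (in hardware the slices would be processed concurrently), and segments spanning several slices are then joined along the slice boundaries. The sketch stores a full label image for simplicity and does not model the label reuse described in the abstract.

```python
def find(parent, x):
    """Return the root label of x, shortening the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, area, a, b):
    """Merge the sets of labels a and b; accumulate area at the surviving root."""
    ra, rb = find(parent, a), find(parent, b)
    if ra == rb:
        return ra
    if rb < ra:
        ra, rb = rb, ra
    parent[rb] = ra
    area[ra] += area[rb]
    area[rb] = 0
    return ra


def component_areas(image, num_slices):
    """Sorted areas of the 8-connected components of a binary image."""
    height, width = len(image), len(image[0])
    assert 1 <= num_slices <= width  # every slice needs at least one column
    bounds = [width * s // num_slices for s in range(num_slices + 1)]
    parent, area = [], []
    labels = [[-1] * width for _ in range(height)]

    # Phase 1: label each slice on its own, looking only at neighbours
    # that lie inside the same slice (left pixel and the row above).
    for s in range(num_slices):
        lo, hi = bounds[s], bounds[s + 1]
        for y in range(height):
            for x in range(lo, hi):
                if not image[y][x]:
                    continue
                neighbours = []
                if x > lo and labels[y][x - 1] >= 0:
                    neighbours.append(labels[y][x - 1])
                if y > 0:
                    for nx in (x - 1, x, x + 1):
                        if lo <= nx < hi and labels[y - 1][nx] >= 0:
                            neighbours.append(labels[y - 1][nx])
                if neighbours:
                    root = find(parent, neighbours[0])
                    for n in neighbours[1:]:
                        root = union(parent, area, root, n)
                else:
                    root = len(parent)  # allocate a new label
                    parent.append(root)
                    area.append(0)
                area[root] += 1
                labels[y][x] = root

    # Phase 2: join segments that touch across a slice boundary.
    for b in bounds[1:-1]:
        for y in range(height):
            if labels[y][b - 1] < 0:
                continue
            for ny in (y - 1, y, y + 1):
                if 0 <= ny < height and labels[ny][b] >= 0:
                    union(parent, area, labels[y][b - 1], labels[ny][b])

    return sorted(area[i] for i in range(len(parent)) if parent[i] == i)


# Illustrative input: one object spanning all columns plus two single pixels.
IMAGE = [
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
]
```

Calling `component_areas(IMAGE, 1)` and `component_areas(IMAGE, 5)` should give the same list of areas, since the number of slices affects only how the work is divided, not the result.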
