A quantitative cross-architecture study of morphological image processing on CPUs, GPUs, and FPGAs

The rapidly growing number of applications that rely on morphological operations in image processing and computer vision makes efficient implementations of these key building blocks an important research topic. Nevertheless, a detailed comparison of the energy efficiency and performance of such implementations across all major hardware platforms is still missing. In this paper, we evaluate in detail the performance and power consumption of the most efficient available morphological image processing algorithms on CPU, GPU, and FPGA platforms. In addition, we study the suitability of available morphological library units for high-level synthesis and compare the results with an optimized hand-coded FPGA implementation. We demonstrate that even high-end GPUs fall far short of the throughput of modern CPUs and FPGAs. Our experimental results show that an FPGA implementation is 8-10 times more energy efficient for this application while remaining comparable in speed to CPUs for large kernels.
