GPU acceleration of an image characterization algorithm for document similarity analysis

This paper aims to provide decision support for selecting a software and hardware architecture for content-based document comparison. We evaluate Java, C, CUDA C, and OpenCL implementations of an image characterization algorithm used for content-based document comparison, running them on a CPU and on NVIDIA and AMD graphics processing units (GPUs). Based on our experimental results, we conclude that the original Java implementation, running on a CPU, can be accelerated by a factor of 6 by re-implementing it in C, or by a factor of almost 16 by re-implementing it in CUDA C and running it on an NVIDIA GTX 480 GPU. We also provide a power efficiency analysis.