Parallel Implementation of the Integral Histogram

The integral histogram is a recently proposed preprocessing technique for computing histograms of arbitrary rectangular regions of gridded data (i.e., images or volumes) in constant time. We formulate a general parallel version of the integral histogram and analyse its implementation in Star Superscalar (StarSs). StarSs provides a uniform programming and runtime environment and facilitates the development of portable code for heterogeneous parallel architectures. In particular, we discuss the implementation for the multi-core IBM Cell Broadband Engine (Cell/B.E.) and provide extensive performance measurements and tradeoffs for two different scan orders (histogram propagation methods). For 640 × 480 images, a tile (block) size of 28 × 28, and 16 histogram bins, the parallel algorithm exceeds real-time performance, reaching more than 200 frames per second.
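
To make the constant-time claim concrete, the following is a minimal sequential sketch of an integral histogram in Python. It is not the parallel StarSs or Cell/B.E. implementation discussed here; it assumes a simple raster-scan propagation, a 2D image stored as a list of rows of integers, and uniform binning. The names `integral_histogram`, `region_histogram`, and `max_value` are hypothetical and chosen for illustration only.

```python
def integral_histogram(image, bins, max_value=256):
    """Return H where H[y][x][k] counts the pixels of bin k in image rows < y, columns < x."""
    rows, cols = len(image), len(image[0])
    # One extra zero row and column remove the special cases at the image border.
    H = [[[0] * bins for _ in range(cols + 1)] for _ in range(rows + 1)]
    for y in range(rows):
        for x in range(cols):
            b = image[y][x] * bins // max_value  # uniform binning (assumption)
            cur = H[y + 1][x + 1]
            up, left, diag = H[y][x + 1], H[y + 1][x], H[y][x]
            # Propagate the histograms of the upper and left neighbours,
            # subtracting the doubly counted upper-left one.
            for k in range(bins):
                cur[k] = up[k] + left[k] - diag[k]
            cur[b] += 1  # add the current pixel
    return H


def region_histogram(H, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and columns left..right-1 from four corner lookups."""
    bins = len(H[0][0])
    return [
        H[bottom][right][k] - H[top][right][k] - H[bottom][left][k] + H[top][left][k]
        for k in range(bins)
    ]


image = [[0, 64, 128],
         [192, 255, 0]]
H = integral_histogram(image, bins=4)
full = region_histogram(H, 0, 0, 2, 3)
right_part = region_histogram(H, 0, 1, 2, 3)
```

Once `H` is built, each region query costs four lookups per bin, independent of the region size. The per-pixel update depends only on the upper, left, and upper-left neighbours, which is the dependency structure that a tiled parallel version has to respect.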
