Provisional chapter: Algorithms for Efficient Computation of Convolution

Convolution is an important mathematical tool in both signal and image processing. It is employed in filtering [1, 2], denoising [3], edge detection [4, 5], correlation [6], compression [7, 8], deconvolution [9, 10], simulation [11, 12], and many other applications. Although the concept of convolution is not new, its efficient computation is still an open topic. As the amount of processed data constantly increases, there is considerable demand for fast processing of very large data sets. Moreover, there is demand for fast algorithms that can exploit the computational power of modern parallel architectures.

[1]  F. Harris.  On the use of windows for harmonic analysis with the discrete Fourier transform , 1978, Proceedings of the IEEE.

[2]  Rudy Lauwereins,et al.  Real-time accurate stereo with bitwise fast voting on CUDA , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[3]  Don H. Johnson,et al.  Gauss and the history of the fast Fourier transform , 1984, IEEE ASSP Magazine.

[4]  Xinxin Wang,et al.  GPU implemention of fast Gabor filters , 2010, Proceedings of 2010 IEEE International Symposium on Circuits and Systems.

[5]  Hong Shan Neoh,et al.  Adaptive Edge Detection for Real-Time Video Processing using FPGAs , 2005 .

[6]  Jim R. Parker,et al.  Algorithms for image processing and computer vision , 1996 .

[7]  Pavel Zemcík,et al.  Real-time object detection on CUDA , 2010, Journal of Real-Time Image Processing.

[8]  Sabine Pruggnaller,et al.  Performance evaluation of image processing algorithms on the GPU. , 2008, Journal of structural biology.

[9]  Adam Herout,et al.  Low-Level Image Features for Real-Time Object Detection , 2010 .

[10]  Stephen A. Dyer,et al.  Digital signal processing , 2018, 8th International Multitopic Conference, 2004. Proceedings of INMIC 2004.

[11]  David Salomon,et al.  Data Compression: The Complete Reference , 2006 .

[12]  Naga K. Govindaraju,et al.  High performance discrete Fourier transforms on graphics processors , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.

[13]  Satoshi Matsuoka,et al.  Bandwidth intensive 3-D FFT kernel for GPUs using CUDA , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.

[14]  Jiri Jan.  Digital Signal Filtering, Analysis and Restoration (Telecommunications Series) , 2000 .

[15]  G. A. Ruiz.  Design And Architectures For Digital Signal Processing , 2014 .

[16]  Wei Chen,et al.  High performance median filtering using commodity graphics hardware , 2009, 2009 IEEE Nuclear Science Symposium Conference Record (NSS/MIC).

[17]  Charalambos D. Stamopoulos.  Parallel Image Processing , 1975, IEEE Transactions on Computers.

[18]  Kevin Skadron,et al.  A performance study of general-purpose applications on graphics processors using CUDA , 2008 .

[19]  Victor Podlozhnyuk,et al.  Image Convolution with CUDA , 2007 .

[20]  Marco Lanuzza,et al.  A high-performance fully reconfigurable FPGA-based 2D convolution processor , 2005, Microprocess. Microsystems.

[21]  H. Nussbaumer.  Fast Fourier transform and convolution algorithms , 1981 .

[22]  Kevin Skadron,et al.  A performance study of general-purpose applications on graphics processors using CUDA , 2008, J. Parallel Distributed Comput..

[23]  Alan R. Jones,et al.  Fast Fourier Transform , 1970, SIGP.

[24]  Ali M. Reza,et al.  FPGA implementation of adaptive temporal Kalman filter for real time video filtering , 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99.

[25]  Zhenyu Li,et al.  FFT and convolution algorithms on DSP microprocessors , 1986, ICASSP '86. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[26]  Didier Demigny,et al.  Efficient ASIC and FPGA implementations of IIR filters for real time edge detection , 1997, Proceedings of International Conference on Image Processing.

[27]  J. Selinummi,et al.  Simulating fluorescent microscope images of cell populations , 2005, 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference.

[28]  Richard G. Shoup.  Parameterized convolution filtering in an FPGA , 1994 .

[29]  Abbes Amira,et al.  FPGA implementations of fast fourier transforms for real-time signal and image processing , 2003, Proceedings. 2003 IEEE International Conference on Field-Programmable Technology (FPT).

[30]  R. Bracewell.  Fourier Analysis and Imaging , 2004 .

[31]  David Svoboda,et al.  Convolution of large 3D images on GPU and its decomposition , 2011, EURASIP J. Adv. Signal Process..

[32]  Dana H. Ballard,et al.  Generalizing the Hough transform to detect arbitrary shapes , 1981, Pattern Recognit..

[33]  J. Tukey,et al.  An algorithm for the machine calculation of complex Fourier series , 1965 .

[34]  Ronald N. Bracewell,et al.  The Fourier Transform and Its Applications , 1966 .

[35]  Jens H. Krüger,et al.  A Survey of General‐Purpose Computation on Graphics Hardware , 2007, Eurographics.

[36]  Michal Kozubek,et al.  Generation of digital phantoms of cell nuclei and simulation of image formation in 3D image cytometry , 2009, Cytometry. Part A : the journal of the International Society for Analytical Cytology.

[37]  G. Amdahl,et al.  Validity of the single processor approach to achieving large scale computing capabilities , 1967, AFIPS '67 (Spring).

[38]  Martin Cadík,et al.  FFT and Convolution Performance in Image Filtering on GPU , 2006, Tenth International Conference on Information Visualisation (IV'06).

[39]  Wen-mei W. Hwu,et al.  Optimization principles and application performance evaluation of a multithreaded GPU using CUDA , 2008, PPoPP.

[40]  Alessandro Foi,et al.  Noise estimation and removal in MR imaging: The variance-stabilization approach , 2011, 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro.

[41]  Andrea Prati,et al.  Image convolution on FPGAs: the implementation of a multi-FPGA FIFO structure , 1998, Proceedings. 24th EUROMICRO Conference.

[42]  Manuel Menezes de Oliveira Neto,et al.  Realistic real-time sound re-synthesis and processing for interactive virtual worlds , 2009, The Visual Computer.

[43]  Ramani Duraiswami,et al.  Canny edge detection on NVIDIA CUDA , 2008, 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops.

[44]  Steven G. Johnson,et al.  The Fastest Fourier Transform in the West , 1997 .

[45]  David Svoboda.  Efficient Computation of Convolution of Huge Images , 2011, ICIAP.

[46]  Javier Díaz,et al.  FPGA-based real-time optical-flow system , 2006, IEEE Transactions on Circuits and Systems for Video Technology.

[47]  Yan Gao,et al.  Recursive Implementation of LoG Filtering , 1997, Real Time Imaging.

[48]  A. W. M. van den Enden,et al.  Discrete Time Signal Processing , 1989 .

[49]  Sanjay Kadam.  Parallelization of Low-Level Computer Vision Algorithms on Clusters , 2008, 2008 Second Asia International Conference on Modelling & Simulation (AMS).

[50]  Wilhelm Burger,et al.  Digital Image Processing - An Algorithmic Introduction using Java , 2008, Texts in Computer Science.

[51]  Rachid Deriche,et al.  Using Canny's criteria to derive a recursively implemented optimal edge detector , 1987, International Journal of Computer Vision.

[52]  G. Ramos.  Roundoff error analysis of the fast Fourier transform , 1970 .

[53]  John F. Canny,et al.  A Computational Approach to Edge Detection , 1986, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[54]  Dah-Jye Lee,et al.  FPGA-Based Embedded Motion Estimation Sensor , 2008, Int. J. Reconfigurable Comput..

[55]  P. J. Verveer,et al.  Computational and optical methods for improving resolution and signal quality in fluorescence microscopy , 1998 .

[56]  Donald Fraser,et al.  Array Permutation by Index-Digit Permutation , 1976, JACM.

[57]  Lucas J. van Vliet,et al.  Recursive implementation of the Gaussian filter , 1995, Signal Process..

[58]  Ishfaq Ahmad,et al.  An Efficient Parallel Algorithm for Computing the Gaussian Convolution of Multi-dimensional Image Data , 2004, The Journal of Supercomputing.

[59]  Gerhard Fettweis,et al.  Implementation of recursive digital filters into vector SIMD DSP architectures , 2004, 2004 IEEE International Conference on Acoustics, Speech, and Signal Processing.

[60]  Koji Nakano,et al.  Efficient Canny Edge Detection Using a GPU , 2010, 2010 First International Conference on Networking and Computing.

[61]  Will R. Moore,et al.  Selected papers from the Oxford 1993 international workshop on field programmable logic and applications on More FPGAs , 1994 .

[62]  David Svoboda,et al.  GPU Optimization of Convolution for Large 3-D Real Images , 2012, ACIVS.