A gradient coprocessor of optical flow: A hardware co-simulation using FPGA based on MAC-DA

Optical flow is one of the tools used for video content analysis (VCA). Optical flow estimation relies on a gradient coprocessor as a preprocessing stage to estimate partial derivatives. The gradient coprocessor consists of kernels (masks) that are convolved with each pixel of the image sequence. In an FPGA implementation of the gradient, multiply-accumulate (MAC) is an important technique for reducing the number of multipliers and adders in the convolution process. A finite impulse response (FIR) structure is one of the methods commonly used to implement MAC. However, the FIR-based multiply-accumulate (MAC-FIR) still consumes considerable resources and limits speed. To address this shortcoming, this paper presents an implementation of MAC using distributed arithmetic. The design and optimization of the multiply-accumulate based on distributed arithmetic (MAC-DA) are illustrated in this paper. Comparisons of resources (area) and speed between MAC-FIR and MAC-DA are also given.
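
As a rough illustration of the idea behind distributed arithmetic, the following Python sketch contrasts a conventional MAC (one multiplication per kernel tap) with a bit-serial lookup-table formulation that uses only table reads, shifts, and additions. It is a software model for intuition only, not the FPGA design described in this paper. The 5-tap derivative kernel, the 8-bit unsigned pixel width, and all function and variable names are assumptions chosen for the example.

```python
def mac_direct(coeffs, samples):
    """Reference MAC: one multiply and one add per kernel tap."""
    acc = 0
    for c, x in zip(coeffs, samples):
        acc += c * x
    return acc


def build_da_lut(coeffs):
    """Precompute every partial sum of the fixed kernel coefficients.

    Entry `addr` holds the sum of coeffs[k] for each k whose bit is set
    in `addr`, so the table has 2**len(coeffs) entries.
    """
    n_taps = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n_taps)
    ]


def mac_da(lut, samples, n_bits=8):
    """Bit-serial distributed-arithmetic MAC for unsigned n_bits samples."""
    acc = 0
    for b in range(n_bits):
        # Gather bit b of every sample into one lookup-table address.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k
        # A table read plus shift-and-add replaces the per-tap multiplies.
        acc += lut[addr] << b
    return acc


# Assumed example data: an illustrative 5-tap derivative mask and one
# window of 8-bit pixel values.
KERNEL = [1, -8, 0, 8, -1]
WINDOW = [12, 200, 37, 255, 0]
LUT = build_da_lut(KERNEL)

assert mac_da(LUT, WINDOW) == mac_direct(KERNEL, WINDOW)
```

In this model the multipliers disappear entirely: the cost moves into a table whose size grows as 2^N with the number of taps N, and into one accumulation step per input bit. That trade-off is the kind of area and speed difference the MAC-FIR versus MAC-DA comparison is concerned with.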