Hardware-software codesign of a tightly-coupled coprocessor for video content analysis

Hardware acceleration is a popular method to boost performance in video processing applications. This paper shows how to accelerate such applications on a general-purpose CPU by means of a coprocessor that is tightly coupled to the instruction pipeline. A method for efficient data transfer between CPU and coprocessor is developed, and the resulting data path architecture with optimum scheduling of operations is demonstrated. Based on this method, a coprocessor has been implemented in a Virtex-5 FPGA with an embedded PowerPC to accelerate candidate operations of a video content analysis algorithm. Experimental results indicate that with a relatively small degree of parallelism, corresponding to modest hardware cost, the overall frame rate can be increased by between 18% and 105%, depending on processing and application parameters.