A Reconfigurable Hardware Architecture for Fractional Pixel Interpolation in High Efficiency Video Coding

We present a novel reconfigurable hardware architecture for interpolation filtering in High Efficiency Video Coding (HEVC) that adapts to run-time changes in the number of interpolation filter calls and thereby offers high potential for energy efficiency. It employs a picture-based prediction scheme that estimates the number of interpolation filter calls at run time by monitoring the group-of-pictures history, using knowledge of the video coding structure. We develop reconfigurable acceleration engines that can adapt to different filter types. Dynamically composing different instances of these engines yields implementation versions with different area versus throughput tradeoffs. A run-time selection scheme determines the best implementation version for each picture based on the throughput requirements. Compared to the state of the art, our architecture reduces resource usage by 57% while supporting various throughputs and video resolutions.
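
The interplay between the prediction scheme and the run-time selection scheme can be illustrated with a minimal sketch. It is not the architecture's actual algorithm: the names `Version`, `predict_calls`, and `select_version`, the mean-based predictor over same-type pictures, and the "smallest area that meets the required throughput" policy are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Version:
    """Hypothetical implementation version composed of acceleration engines."""
    name: str
    area: int          # assumed area cost, e.g. number of engine instances
    throughput: float  # assumed sustainable filter calls per second


def predict_calls(history, picture_type):
    """Estimate the filter calls of the next picture from the GOP history.

    `history` is a list of (picture_type, filter_calls) pairs. The estimate
    is the mean over past pictures of the same type (an assumed predictor).
    With no matching picture, fall back to the largest count seen so far.
    """
    same_type = [calls for ptype, calls in history if ptype == picture_type]
    if same_type:
        return mean(same_type)
    return max((calls for _, calls in history), default=0)


def select_version(versions, predicted_calls, frame_rate):
    """Pick the smallest-area version that meets the required throughput."""
    required = predicted_calls * frame_rate  # filter calls per second
    feasible = [v for v in versions if v.throughput >= required]
    if feasible:
        return min(feasible, key=lambda v: v.area)
    # Nothing is fast enough: fall back to the fastest available version.
    return max(versions, key=lambda v: v.throughput)


# Illustrative numbers only; they are not taken from the paper.
versions = [
    Version("small", area=1, throughput=50_000),
    Version("medium", area=2, throughput=100_000),
    Version("large", area=4, throughput=200_000),
]
history = [("P", 1000), ("B", 3000), ("P", 1200), ("B", 3400)]
chosen = select_version(versions, predict_calls(history, "B"), frame_rate=30)
```

Under these assumed numbers, a B picture is predicted to need more filter calls than a P picture, so it is assigned a larger version, while a P picture can run on the smallest one.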
