Improving the energy efficiency of a low-area SATD hardware architecture using fine grain PDE

Video coding has become widespread through mobile devices. At the same time, video resolutions have grown, demanding higher coding efficiency and motivating the development of the state-of-the-art standard, High Efficiency Video Coding (HEVC). To achieve the required efficiency, however, the new standard greatly increased computational intensity. Combined with the real-time constraints of mobile devices, this creates a need for dedicated hardware implementations. A large part of the added complexity lies in Motion Estimation (ME), within the prediction step, which now uses larger blocks and more candidates. In ME, similarity metrics are central to the performance achieved, in terms of both coding efficiency and the number of computations required. One of the metrics used is the Sum of Absolute Transformed Differences (SATD). SATD may be computed using a Transpose Buffer (TB), which greatly increases the architecture area as block sizes grow. Alternatively, a Linear Buffer (LB) can be used, which requires less area at the cost of higher energy consumption. This paper evaluates Partial Distortion Elimination (PDE) as a means of reducing the energy penalty associated with LB architectures. Our experimental results show that using PDE along with the LB architecture provides up to 65.61% energy reduction with a negligible area penalty.
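
As an illustration of how PDE interacts with SATD, the following is a minimal software sketch, not the hardware architecture evaluated in the paper. It assumes an unnormalized 4x4 Hadamard transform applied to each 4x4 sub-block of the difference block, and it assumes that "fine grain" means comparing the running sum against the best cost so far after every column of four transformed coefficients. The names `hadamard4`, `satd_pde`, and `best_candidate` are hypothetical. Because every accumulated term is an absolute value, the partial sum never decreases, so a candidate can be discarded as soon as the partial sum reaches the best SATD found so far without changing the search result.

```python
def hadamard4(v):
    """Unnormalized 4-point Hadamard transform via two butterfly stages."""
    a, b, c, d = v
    s0, s1, d0, d1 = a + b, c + d, a - b, c - d
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]


def satd_pde(orig, cand, best=float("inf")):
    """SATD of two square blocks (side a multiple of 4) with early exit.

    Returns the SATD, or None if the partial sum reaches `best` first.
    Assumption: 4x4 sub-block transforms and a per-column check.
    """
    size = len(orig)
    partial = 0
    for by in range(0, size, 4):
        for bx in range(0, size, 4):
            # Difference between the original and candidate sub-blocks.
            diff = [[orig[by + y][bx + x] - cand[by + y][bx + x]
                     for x in range(4)] for y in range(4)]
            # First 1-D transform along the rows.
            rows = [hadamard4(r) for r in diff]
            # Second 1-D transform along each column, accumulating as we go.
            for x in range(4):
                col = hadamard4([rows[y][x] for y in range(4)])
                partial += sum(abs(c) for c in col)
                if partial >= best:
                    return None  # candidate cannot beat the current best
    return partial


def best_candidate(orig, candidates):
    """Return (index, SATD) of the lowest-cost candidate block."""
    best, best_idx = float("inf"), None
    for i, cand in enumerate(candidates):
        cost = satd_pde(orig, cand, best)
        if cost is not None:
            best, best_idx = cost, i
    return best_idx, best
```

In this sketch the saving comes from the transforms and accumulations skipped after an early exit. How much of that translates into energy savings in hardware depends on the architecture and is not modeled here.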
