Exploring pel decimation to trade off between energy and quality in video coding

This work investigates the trade-offs between energy and quality in video coding when pel decimation is applied. Realistic estimates of area and energy per block were obtained by simulating five architectures designed specifically to compute the Sum of Absolute Differences (SAD) for 4×4 pixel blocks. One of these architectures can be configured to operate with 1:1, 4:3, 2:1 or 4:1 sample ratios, whereas each of the other four is tailored to operate exclusively with one of those ratios. The five VLSI architectures were logically synthesized using a 45 nm industrial standard cell library, both for a target frequency and for the maximum achievable frequency. They were then simulated with 100 k input vectors obtained from an H.264/AVC encoder running on one full HD (1080p) video sample. The results show that, with the configurable architecture operating at full sampling, the best energy/block result was 3.54 pJ/block (60% better than the non-configurable one, with 7.08 pJ/block). The energy/block value can be further reduced to 1.34 pJ/block at an average cost of 2.8% in PSNR and 14.1% in SSIM.
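
To make the idea of pel decimation concrete, the following is a minimal software sketch of a 4×4 SAD computed over a subset of the block's samples. It is not the hardware described above. The sampling masks, the names `MASKS` and `sad_4x4`, and in particular the positions of the discarded pixels are assumptions made for illustration, since the abstract gives only the sample ratios and not the decimation patterns.

```python
# Hypothetical sampling masks for a 4x4 block (1 = pixel kept, 0 = discarded).
# Only the ratios come from the abstract; the patterns below are assumed.
MASKS = {
    "1:1": [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]],  # 16 samples
    "4:3": [[0, 1, 1, 1], [1, 0, 1, 1], [1, 1, 0, 1], [1, 1, 1, 0]],  # 12 samples
    "2:1": [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]],  # 8 samples
    "4:1": [[1, 0, 1, 0], [0, 0, 0, 0], [1, 0, 1, 0], [0, 0, 0, 0]],  # 4 samples
}


def samples_used(ratio):
    """Number of pixel pairs that enter the SAD for the given ratio."""
    return sum(sum(row) for row in MASKS[ratio])


def sad_4x4(cur, ref, ratio="1:1"):
    """Sum of absolute differences over the pixels kept by the mask."""
    mask = MASKS[ratio]
    return sum(
        abs(cur[r][c] - ref[r][c])
        for r in range(4)
        for c in range(4)
        if mask[r][c]
    )


if __name__ == "__main__":
    cur = [[10] * 4 for _ in range(4)]  # current block, flat value 10
    ref = [[7] * 4 for _ in range(4)]   # candidate block, flat value 7
    for ratio in MASKS:
        print(ratio, samples_used(ratio), sad_4x4(cur, ref, ratio))
```

Fewer samples mean fewer absolute-difference operations per block, which is the source of the energy saving; the price is a SAD that only approximates the full-sampling value, which is where the quality loss comes from.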