A Low-power Pyramid Motion Estimation Engine for 4K@30fps Realtime HEVC Video Encoding

This paper presents the design and VLSI implementation of a pyramid block-matching motion estimation (ME) engine, which consists of cascaded Integer Motion Estimation (IME) and Fractional Motion Estimation (FME). The IME is further divided into three cascaded stages (quarter sub-sample search, half sub-sample search, and integer-sample search), while the FME is divided into two cascaded stages (half-sample interpolation and quarter-sample interpolation). Global Motion Estimation (GME) is introduced to compensate for large object motion under the limited search range of ±160 × ±96. We also employ a lossless compression algorithm based on pixel tiles to reduce DRAM bandwidth by 50%. The design is integrated into a 4K real-time HEVC video encoder and fabricated in TSMC 28nm technology. The total ME hardware cost is 1,094k gates and 75.5KB of SRAM, reductions of 40% to 55% and 64% to 86%, respectively, compared with reference designs. Measured results show that our implementation achieves 4096×2160@30fps real-time encoding when running at 350 MHz while consuming 47mW, with an energy efficiency of 0.18nJ/pixel and a power reduction of 18%.
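To make the three-stage IME cascade concrete, the following Python sketch shows one way a coarse-to-fine block search can be organised. It is an illustration only, not the hardware architecture of this work: the plain decimation by 4 and 2 in each dimension, the SAD cost, the 16×16 block, the ±1 refinement at the finer levels, the toy coarse radius, and all function names are assumptions introduced here.

```python
def decimate(frame, step):
    """Keep every `step`-th sample in both dimensions (no low-pass filter)."""
    return [row[::step] for row in frame[::step]]


def sad(cur, ref, bx, by, mvx, mvy, n):
    """Sum of absolute differences of an n x n block; None if out of frame."""
    rx, ry = bx + mvx, by + mvy
    if rx < 0 or ry < 0 or ry + n > len(ref) or rx + n > len(ref[0]):
        return None
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, center, radius):
    """Exhaustive search of a (2*radius+1)^2 window around `center`."""
    best_mv, best_cost = center, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mv = (center[0] + dx, center[1] + dy)
            cost = sad(cur, ref, bx, by, mv[0], mv[1], n)
            if cost is not None and (best_cost is None or cost < best_cost):
                best_mv, best_cost = mv, cost
    return best_mv


def pyramid_ime(cur, ref, bx, by, n=16, coarse_radius=3):
    """Quarter -> half -> integer sample search for the block at (bx, by).

    bx, by and n are assumed to be multiples of 4. Returns the integer
    motion vector (mvx, mvy) in full-resolution samples.
    """
    mv = (0, 0)
    for step, radius in ((4, coarse_radius), (2, 1), (1, 1)):
        c, r = decimate(cur, step), decimate(ref, step)
        mv = full_search(c, r, bx // step, by // step, n // step, mv, radius)
        if step > 1:
            mv = (mv[0] * 2, mv[1] * 2)  # rescale to the next finer level
    return mv
```

In this sketch only the coarsest level searches a wide window, and each finer level refines the rescaled vector by one sample, which is the general reason a pyramid search needs far fewer block comparisons than a full-resolution exhaustive search. A ±160 × ±96 range would correspond to a coarse window of ±40 × ±24 under the decimation assumed here.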