Source adaptive software 2D iDCT with SIMD

This paper presents a fast two-dimensional inverse discrete cosine transform (iDCT) that adapts to the statistics of the compressed video source to reduce execution time. iDCT algorithms for sparse blocks skip the calculations for some zero coefficients and are implemented with quad-word parallel single-instruction-multiple-data (SIMD) multimedia instructions. End-of-block marker value histograms are observed to vary little within a single shot. An adaptive control mechanism is therefore proposed that chooses the optimal set of iDCTs to prepare for an entire shot from its first frames, which reduces software overheads and penalties. The method introduces no degradation of decoded video quality compared with a conventional SIMD 8×8 iDCT implemented with Intel MMX instructions. For 4 Mbps MPEG2 natural video, execution time is confirmed to be reduced by an additional 15% with Murata's method. In comparison, execution time is reduced by 22% with a modified version of Murata's method, and by 35% with the new source adaptive method.
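The idea of skipping zero coefficients and choosing a set of iDCTs from end-of-block (EOB) statistics can be illustrated with a minimal scalar sketch. It is not the paper's MMX implementation: the names `idct_2d`, `active_size`, and `choose_kernels` are hypothetical, the mapping from EOB position to a top-left square of coefficients is one assumed way to define a sparse block, and the cost model (work proportional to the square of the kernel size) is an assumption made only to give the selection step something to minimize.

```python
import math
from itertools import combinations

N = 8  # block size

# BASIS[x][u] = c(u)/2 * cos((2x+1) * u * pi / 16), with c(0) = 1/sqrt(2)
BASIS = [[0.5 * (math.sqrt(0.5) if u == 0 else 1.0)
          * math.cos((2 * x + 1) * u * math.pi / 16)
          for u in range(N)] for x in range(N)]

# Standard zigzag scan order as (row, col) pairs
ZIGZAG = sorted(((r, c) for r in range(N) for c in range(N)),
                key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def active_size(eob):
    """Side of the smallest top-left square that holds the first `eob`
    zigzag coefficients (assumed definition of a sparse block)."""
    return max((max(r, c) + 1 for r, c in ZIGZAG[:eob]), default=0)


def idct_2d(block, active=N):
    """Separable 8x8 iDCT that reads only the top-left active x active
    coefficients; everything outside that square is treated as zero."""
    # Row pass: only the first `active` rows can be non-zero.
    tmp = [[sum(block[u][v] * BASIS[y][v] for v in range(active))
            for y in range(N)] for u in range(active)]
    # Column pass: combine the `active` intermediate rows.
    return [[sum(tmp[u][y] * BASIS[x][u] for u in range(active))
             for y in range(N)] for x in range(N)]


def choose_kernels(size_hist, budget):
    """Pick `budget` kernel sizes, always including the full 8x8 one,
    that minimize a modelled cost over a histogram {active size: count}."""
    best = None
    for extra in combinations(range(1, N), budget - 1):
        sizes = extra + (N,)
        # Each block runs on the smallest prepared kernel that covers it.
        # Hypothetical cost model: work grows with the square of the size.
        cost = sum(count * min(s for s in sizes if s >= need) ** 2
                   for need, count in size_hist.items())
        if best is None or cost < best[0]:
            best = (cost, sizes)
    return best[1]
```

In this sketch a decoder would build `size_hist` from the EOB values of the first frames of a shot, call `choose_kernels` once, and then dispatch each block to the smallest chosen kernel that covers `active_size(eob)`. Because the skipped coefficients are zero, the pruned result matches the full transform, which is consistent with the claim of no quality degradation.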

[1]  G.S. Moschytz, et al.  Practical fast 1-D DCT algorithms with 11 multiplications, 1989, International Conference on Acoustics, Speech, and Signal Processing.

[2]  Y. Arai, et al.  A Fast DCT-SQ Scheme for Images, 1988.

[3]  Hsieh S. Hou.  A fast recursive algorithm for computing the discrete cosine transform, 1987, IEEE Trans. Acoust. Speech Signal Process.

[4]  Teresa H. Meng, et al.  Statistical inverse discrete cosine transforms for image compression, 1994, Electronic Imaging.

[5]  Zhongde Wang, et al.  Pruning the fast discrete cosine transform, 1991, IEEE Trans. Commun.

[6]  Zhigang Chen, et al.  A fast degradation-free algorithm for DCT block extraction in the compressed domain, 1999, 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. ICASSP99 (Cat. No.99CH36258).

[7]  Masao Ikekawa, et al.  Fast 2D IDCT implementation with multimedia instructions for a software MPEG2 decoder, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[8]  Wen-Hsiung Chen, et al.  A Fast Computational Algorithm for the Discrete Cosine Transform, 1977, IEEE Trans. Commun.