A Row-Parallel 8 × 8 2-D DCT Architecture Using Algebraic Integer-Based Exact Computation

An algebraic integer (AI)-based time-multiplexed row-parallel architecture and two final reconstruction step (FRS) algorithms are proposed for the implementation of the bivariate AI-encoded 2-D discrete cosine transform (DCT). The architecture directly realizes an error-free 2-D DCT without using FRSs between the row and column transforms, leading to an 8 × 8 2-D DCT that is entirely free of quantization errors in the AI basis. As a result, the user-selectable accuracy in the FRS allows each of the 64 coefficients to have its precision set independently of the others, avoiding the leakage of quantization noise between channels that occurs in published DCT designs. The two proposed FRS approaches are based on 1) optimized Dempster-Macleod multipliers and 2) expansion factor scaling. This architecture enables low-noise, high-dynamic-range applications in digital video processing that require full control of the finite-precision computation of the 2-D DCT. The proposed architectures and FRS techniques are experimentally verified and validated using hardware implementations that are physically realized on a field-programmable gate array (FPGA) chip. Six designs, for 4-bit and 8-bit input word sizes, using the two proposed FRS schemes, have been designed, simulated, physically implemented, and measured. The maximum clock rate and block rate achieved among the 8-bit input designs are 307.787 MHz and 38.47 MHz, respectively, implying a pixel rate of 8 × 307.787 MHz ≈ 2.462 GHz if eventually embedded in a real-time video-processing system. The equivalent frame rate is about 1187.35 Hz for an image size of 1920 × 1080. All implementations are functional on a Xilinx Virtex-6 XC6VLX240T FPGA device.
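
The throughput figures quoted above follow from simple arithmetic on the clock rate. The sketch below retraces that arithmetic under one assumption that the abstract implies but does not state outright: the row-parallel datapath accepts one 8-pixel row per clock cycle, so an 8 × 8 block takes 8 cycles. The function and constant names are illustrative only and do not come from the paper.

```python
# Hypothetical helper: retraces the abstract's throughput arithmetic.
# Assumption (not stated explicitly in the abstract): one 8-pixel row
# enters the row-parallel datapath on every clock cycle.

PIXELS_PER_ROW = 8      # row-parallel input width
ROWS_PER_BLOCK = 8      # an 8 x 8 block needs 8 row cycles


def throughput(clock_mhz, width=1920, height=1080):
    """Return (block_rate_hz, pixel_rate_hz, frame_rate_hz)."""
    clock_hz = clock_mhz * 1e6
    block_rate_hz = clock_hz / ROWS_PER_BLOCK      # blocks per second
    pixel_rate_hz = clock_hz * PIXELS_PER_ROW      # pixels per second
    frame_rate_hz = pixel_rate_hz / (width * height)
    return block_rate_hz, pixel_rate_hz, frame_rate_hz


if __name__ == "__main__":
    block, pixel, frame = throughput(307.787)
    print(f"block rate: {block / 1e6:.2f} MHz")
    print(f"pixel rate: {pixel / 1e9:.3f} GHz")
    print(f"frame rate: {frame:.1f} Hz at 1920 x 1080")
```

With the 307.787 MHz clock, this gives a block rate near 38.47 MHz, a pixel rate near 2.462 GHz, and a frame rate close to the roughly 1187 Hz reported for 1920 × 1080 frames; small differences in the last digits depend on where intermediate values are rounded.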
