Parallel-pipeline 8×8 forward 2-D ICT processor chip for image coding

The Integer Cosine Transform (ICT) achieves performance close to that of the Discrete Cosine Transform (DCT) at reduced computational complexity. Because the ICT kernel is integer-based, its computation requires only addition and shift operations. This work presents a parallel-pipelined architecture of an 8×8 forward two-dimensional (2-D) ICT(10,9,6,2,3,1) processor for image encoding. It uses a fully pipelined row-column decomposition built from two one-dimensional (1-D) ICTs and a transpose buffer made of D-type flip-flops. The main characteristics of the 1-D ICT architecture are high throughput, parallel processing, reduced internal storage, and 100% efficiency in the computational elements. The arithmetic units are distributed and consist of adders/subtractors operating at half the frequency of the input data rate. In this transform, truncation and rounding errors are introduced only at the final normalization stage. The normalization coefficient word length of 18 bits (13 bits effective) has been established using the requirements of IEEE Standard 1180-1990 as a reference. The processor has been implemented using a standard-cell design methodology in 0.35-µm CMOS technology, measures 9.3 mm², and contains 12.4 k gates. The maximum frequency is 300 MHz, with a latency of 214 cycles (260 cycles with normalization).
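
The row-column decomposition and the shift-and-add arithmetic described above can be illustrated with a short software sketch. This is only a behavioural model under stated assumptions, not the chip's architecture: the abstract does not spell out the kernel matrix, so the code assumes the commonly used order-8 ICT sign pattern with (a, b, c, d, e, f) = (10, 9, 6, 2, 3, 1), and the names `KERNEL`, `times`, `ict_1d`, `ict_2d`, and `normalize` are hypothetical. The normalization step uses floating point for clarity rather than the 18-bit fixed-point coefficients mentioned in the abstract.

```python
import math

# Assumed kernel: standard order-8 ICT sign pattern with
# (a, b, c, d, e, f) = (10, 9, 6, 2, 3, 1). Not given in the abstract.
A, B, C, D, E, F = 10, 9, 6, 2, 3, 1
KERNEL = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [A, B, C, D, -D, -C, -B, -A],
    [E, F, -F, -E, -E, -F, F, E],
    [B, -D, -A, -C, C, A, D, -B],
    [1, -1, -1, 1, 1, -1, -1, 1],
    [C, -A, D, B, -B, -D, A, -C],
    [F, -E, E, -F, -F, E, -E, F],
    [D, -C, B, -A, A, -B, C, -D],
]

# Each kernel magnitude expressed with shifts and adds only.
SHIFT_ADD = {
    1: lambda v: v,
    2: lambda v: v << 1,
    3: lambda v: (v << 1) + v,
    6: lambda v: (v << 2) + (v << 1),
    9: lambda v: (v << 3) + v,
    10: lambda v: (v << 3) + (v << 1),
}


def times(k, x):
    """Multiply integer x by kernel magnitude k without a multiplier."""
    return SHIFT_ADD[k](x)


def ict_1d(vec):
    """Un-normalized 1-D ICT of an 8-element integer vector."""
    out = []
    for row in KERNEL:
        acc = 0
        for coeff, x in zip(row, vec):
            term = times(abs(coeff), x)
            acc += term if coeff > 0 else -term
        out.append(acc)
    return out


def transpose(m):
    return [list(col) for col in zip(*m)]


def ict_2d(block):
    """Un-normalized 2-D ICT: 1-D pass on rows, transpose, 1-D pass again."""
    stage1 = [ict_1d(row) for row in block]      # first 1-D ICT
    buffered = transpose(stage1)                 # role of the transpose buffer
    stage2 = [ict_1d(row) for row in buffered]   # second 1-D ICT
    return transpose(stage2)                     # result equals T * X * T^T


def normalize(coeffs):
    """Final normalization stage (floating point here, for illustration)."""
    norms = [sum(c * c for c in row) for row in KERNEL]
    return [[coeffs[u][v] / math.sqrt(norms[u] * norms[v]) for v in range(8)]
            for u in range(8)]


flat = [[1] * 8 for _ in range(8)]
raw = ict_2d(flat)            # integer-only stage, exact
scaled = normalize(raw)       # the only stage that introduces rounding
print(raw[0][0], scaled[0][0])  # prints "64 8.0"
```

Keeping the two 1-D passes purely integer and deferring all scaling to `normalize` mirrors the statement that truncation and rounding errors arise only in the final normalization stage.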
