Unified Systolic-Like Architecture for DCT and DST Using Distributed Arithmetic

A common computing-core representation of the discrete cosine transform (DCT) and the discrete sine transform (DST) is derived, and a reduced-complexity algorithm is developed for computing this core. A parallel architecture based on the principle of distributed arithmetic is then designed to compute these transforms using the common-core algorithm. The proposed scheme not only yields systolic-like, regular, and modular hardware for computing these transforms, but also offers a significant improvement in area-time efficiency over existing structures. The proposed structure requires no complicated input/output mapping and no complex control. Unlike convolution-based structures, it does not restrict the transform length to a prime or a multiple of a prime, and it can be used as a reusable core for cost-effective, memory-efficient, high-throughput implementation of either transform.
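
As a rough illustration of the distributed-arithmetic principle mentioned above, the following minimal sketch computes a fixed-coefficient inner product using a lookup table and bit-serial accumulation instead of multipliers. It is not the architecture proposed in the paper: the function names `build_lut` and `da_inner_product`, the integer (fixed-point) coefficients, and the two's-complement input format are all assumptions made for illustration only.

```python
def build_lut(coeffs):
    """Precompute every partial sum of the fixed coefficients.

    Entry `addr` holds the sum of coeffs[k] for each k whose bit is set
    in `addr`, so the table has 2**len(coeffs) entries.
    """
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(coeffs, samples, bits):
    """Compute sum(coeffs[k] * samples[k]) without multiplications.

    Assumes each sample is an integer representable in `bits`-bit
    two's complement and each coefficient is an integer (for example,
    a fixed-point scaled transform kernel value).
    """
    lut = build_lut(coeffs)
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Form the table address from bit b of every input sample.
        addr = 0
        for k, x in enumerate(samples):
            addr |= (((x & mask) >> b) & 1) << k
        term = lut[addr] << b  # weight the table output by 2**b
        # The most significant bit carries negative weight in two's complement.
        acc += -term if b == bits - 1 else term
    return acc


if __name__ == "__main__":
    coeffs = [3, -5, 7, 2]
    samples = [-8, 7, 3, -1]
    direct = sum(c * x for c, x in zip(coeffs, samples))
    print(da_inner_product(coeffs, samples, bits=4), direct)
```

The table grows exponentially with the number of coefficients, which is one reason memory efficiency matters in designs of this kind.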
