Power Modeling and Efficient FPGA Implementation of FHT for Signal Processing

The fast Hadamard transform (FHT) belongs to the family of discrete orthogonal transforms and is widely used in image and signal processing applications. This paper proposes a parameterizable and scalable architecture for the FHT with time and area complexities of O(2(W+1)) and O(2N^2), respectively, where W and N are the word and vector lengths. A novel algorithmic transformation for the FHT, based on sparse matrix factorization and distributed arithmetic (DA) principles, is presented. The architecture is parallelized and pipelined to achieve high throughput rates. An efficient, optimized field-programmable gate array (FPGA) implementation of the proposed architecture, which yields excellent performance metrics, is analyzed in detail. Additionally, a functional-level power analysis and modeling methodology is proposed to characterize the power and energy metrics of the cores in terms of system parameters and design variables. The derived mathematical models provide quick presilicon estimates of power and energy, allowing intelligent tradeoffs when the developed cores are incorporated as subblocks in hardware-based image and video processing systems.
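As a point of reference for the transform itself, the following is a minimal software sketch of an unnormalized FHT in natural (Hadamard) ordering, checked against a direct multiplication by the Sylvester-constructed Hadamard matrix. It is not the DA-based hardware architecture described above; it only illustrates the butterfly structure that a sparse factorization of the Hadamard matrix gives rise to. The function names `fht`, `hadamard_matrix`, and `direct_transform` are hypothetical and chosen for illustration.

```python
def fht(x):
    """Unnormalized fast Hadamard transform, natural (Hadamard) ordering.

    Uses log2(N) butterfly stages of N/2 add/subtract pairs each,
    so O(N log N) additions instead of O(N^2) for a direct product.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    out = list(x)  # work on a copy; the caller's data is left untouched
    h = 1
    while h < n:
        # Each stage combines elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def hadamard_matrix(n):
    """Sylvester construction: H_2m = [[H_m, H_m], [H_m, -H_m]], H_1 = [1]."""
    h = [[1]]
    while len(h) < n:
        top = [row + row for row in h]
        bottom = [row + [-v for v in row] for row in h]
        h = top + bottom
    return h


def direct_transform(x):
    """Reference O(N^2) transform: plain matrix-vector product H_N * x."""
    h = hadamard_matrix(len(x))
    return [sum(c * v for c, v in zip(row, x)) for row in h]


if __name__ == "__main__":
    sample = [1, 0, 1, 0, 0, 1, 1, 0]
    fast = fht(sample)
    print(fast)
    # The fast and direct results should agree.
    print(fast == direct_transform(sample))
    # H_N * H_N = N * I, so applying the transform twice scales by N.
    print(fht(fast) == [len(sample) * v for v in sample])
```

Because the Hadamard matrix contains only +1 and -1 entries, each stage needs only additions and subtractions, which is one reason the transform maps well onto adder-based and DA-style hardware.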
