Paper information - High-speed FFT processors based on redundant number systems

High-speed FFT processors based on redundant number systems

Fast Fourier Transform (FFT) processors have a significant impact on the performance of communication systems and have been an active research topic for many years. The FFT consists of consecutive multiply-add operations over complex numbers, carried out in so-called butterfly units. Using a redundant number system is one way to increase the speed of FFT coprocessors: it eliminates carry propagation and hence permits a latency reduction in each stage of the pipelined FFT architecture. This paper proposes a high-speed FFT processor built around a fused-dot-product-add (FDPA) unit that computes AB ± CD ± E in Binary-Signed-Digit (BSD) representation. The proposed FDPA unit consists of a three-operand BSD adder and a BSD constant multiplier. A carry-limited BSD adder is proposed and used in both the three-operand adder and the parallel BSD multiplier to improve the speed of the FDPA unit. Moreover, modified Booth encoding is used to accelerate the BSD multiplier. Synthesis results show that the proposed design is about two times faster than the best previous work, at the cost of higher area and power consumption.
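
To make the carry-propagation claim concrete, the following is a minimal Python sketch of the textbook carry-free addition rule for BSD numbers (digit set {-1, 0, 1}). It is not the carry-limited adder proposed in the paper, whose details the abstract does not give; the function names `bsd_add` and `bsd_value` and the least-significant-digit-first list encoding are assumptions made for illustration only.

```python
def bsd_value(digits):
    """Integer value of a BSD number, least-significant digit first."""
    return sum(d * 2**i for i, d in enumerate(digits))


def bsd_add(x, y):
    """Add two equal-length BSD numbers (digits in {-1, 0, 1}, LSD first).

    Each position produces an interim sum w and a transfer t such that
    x[i] + y[i] == 2*t + w. The choice between the two valid (t, w) pairs
    for +1 and -1 depends only on the digits one position lower, so a
    transfer never ripples further than one position.
    """
    n = len(x)
    interim = [0] * (n + 1)
    transfer = [0] * (n + 1)  # transfer[i] is the transfer entering position i
    for i in range(n):
        p = x[i] + y[i]
        # True when the incoming transfer is guaranteed to be 0 or +1.
        lower_nonneg = i == 0 or (x[i - 1] >= 0 and y[i - 1] >= 0)
        if p == 2:
            t, w = 1, 0
        elif p == 1:
            t, w = (1, -1) if lower_nonneg else (0, 1)
        elif p == 0:
            t, w = 0, 0
        elif p == -1:
            t, w = (0, -1) if lower_nonneg else (-1, 1)
        else:  # p == -2
            t, w = -1, 0
        interim[i] = w
        transfer[i + 1] = t
    # Final digits: interim sum plus incoming transfer, always in {-1, 0, 1}.
    return [interim[i] + transfer[i] for i in range(n + 1)]
```

In this sketch every result digit depends on at most two adjacent input positions, so the addition delay is independent of the word length. This is the property a redundant representation offers to each pipeline stage.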

Amir Kaivani | Seok-Bum Ko
