An Ultra-Area-Efficient 1024-Point In-Memory FFT Processor

Current computing architectures largely follow processor-centric design principles. However, the steadily growing volume of data that applications must process is pushing researchers toward novel, more data-centric processor architectures. Following this principle, this study proposes an area-efficient Fast Fourier Transform (FFT) processor based on in-memory computing. The proposed architecture occupies the smallest footprint in its class, around 0.1 mm², while maintaining acceptable power efficiency. According to the results, the processor exhibits the highest area efficiency (FFT/s/area) among the FFT processors in the current literature.
