Racetrack memory based logic design for in-memory computing
[1] Beng Chin Ooi,et al. In-Memory Big Data Management and Processing: A Survey , 2015, IEEE Transactions on Knowledge and Data Engineering.
[2] Thomas Blum,et al. Montgomery modular exponentiation on reconfigurable hardware , 1999, Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336).
[3] William J. Dally,et al. Smart Memories: a modular reconfigurable architecture , 2000, ISCA '00.
[4] Akashi Satoh,et al. Systematic Design of RSA Processors Based on High-Radix Montgomery Multipliers , 2011, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[5] Jeng-Shyang Pan,et al. Low-Complexity Digit-Serial and Scalable SPB/GPB Multipliers Over Large Binary Extension Fields Using (b,2)-Way Karatsuba Decomposition , 2014, IEEE Transactions on Circuits and Systems I: Regular Papers.
[6] K. Patel,et al. Implementing Digital Signature with RSA Encryption Algorithm to Enhance the Data Security of Cloud in Cloud Computing , 2016 .
[7] Christoforos E. Kozyrakis,et al. A case for intelligent RAM , 1997, IEEE Micro.
[8] Kaushik Roy,et al. TapeCache: a high density, energy efficient cache based on domain wall memory , 2012, ISLPED '12.
[9] Jacques-Olivier Klein,et al. Ultra Low Power Magnetic Flip-Flop Based on Checkpointing/Power Gating and Self-Enable Mechanisms , 2014, IEEE Transactions on Circuits and Systems I: Regular Papers.
[10] Z. Wei,et al. Highly reliable TaOx ReRAM and direct evidence of redox reaction mechanism , 2008, 2008 IEEE International Electron Devices Meeting.
[11] J. McCanny,et al. Modified Montgomery modular multiplication and RSA exponentiation techniques , 2004 .
[12] Michael J. Flynn,et al. Very high-speed computing systems , 1966 .
[13] Q. Stainer,et al. MRAM with soft reference layer: In-stack combination of memory and logic functions , 2013, 2013 5th IEEE International Memory Workshop.
[14] Cheng-Wen Wu,et al. An improved Montgomery's algorithm for high-speed RSA public-key cryptosystem , 1999, IEEE Trans. Very Large Scale Integr. Syst..
[15] Weisheng Zhao,et al. Perpendicular-magnetic-anisotropy CoFeB racetrack memory , 2012 .
[16] Ehsan Atoofian,et al. Reducing shift penalty in Domain Wall Memory through register locality , 2015, 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems (CASES).
[17] Laurent Imbert,et al. Parallel Modular Multiplication on Multi-core Processors , 2013, 2013 IEEE 21st Symposium on Computer Arithmetic.
[18] Frederick T. Chen,et al. Low power and high speed bipolar switching with a thin reactive Ti buffer layer in robust HfO2 based RRAM , 2008, 2008 IEEE International Electron Devices Meeting.
[19] Onur Mutlu,et al. Architecting phase change memory as a scalable dram alternative , 2009, ISCA '09.
[20] Mohamad Towfik Krounbi,et al. Basic principles of STT-MRAM cell operation in memory arrays , 2013 .
[21] Luan Tran,et al. 45nm low power CMOS logic compatible embedded STT MRAM utilizing a reverse-connection 1T/1MTJ cell , 2009, 2009 IEEE International Electron Devices Meeting (IEDM).
[22] Ming-Der Shieh,et al. Scalable Montgomery Modular Multiplication Architecture with Low-Latency and Low-Memory Bandwidth Requirement , 2014, IEEE Transactions on Computers.
[23] P. L. Montgomery. Modular multiplication without trial division , 1985 .
[24] C. Rettner,et al. Current-Controlled Magnetic Domain-Wall Nanowire Shift Register , 2008, Science.
[25] Joonyoung Kim,et al. HBM: Memory solution for bandwidth-hungry processors , 2014, 2014 IEEE Hot Chips 26 Symposium (HCS).
[26] Mircea R. Stan,et al. Relaxing non-volatility for fast and energy-efficient STT-RAM caches , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.
[27] F. Pellizzer,et al. Optimization metrics for Phase Change Memory (PCM) cell architectures , 2014, 2014 IEEE International Electron Devices Meeting.
[28] Sanu Mathew,et al. An improved unified scalable radix-2 Montgomery multiplier , 2005, 17th IEEE Symposium on Computer Arithmetic (ARITH'05).
[29] Chih-Wei Liu,et al. Design and Implementation of High-Speed and Energy-Efficient Variable-Latency Speculating Booth Multiplier (VLSBM) , 2013, IEEE Transactions on Circuits and Systems I: Regular Papers.
[30] S. O. Park,et al. Highly scalable nonvolatile resistive memory using simple binary oxide driven by asymmetric unipolar voltage pulses , 2004, IEDM Technical Digest. IEEE International Electron Devices Meeting, 2004.
[31] Jiwu Shu,et al. Exploring data placement in racetrack memory based scratchpad memory , 2015, 2015 IEEE Non-Volatile Memory System and Applications Symposium (NVMSA).
[32] Sunggu Lee,et al. Accelerating graph computation with racetrack memory and pointer-assisted graph representation , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[33] Colin D. Walter,et al. Hardware Implementation of Montgomery's Modular Multiplication Algorithm , 1993, IEEE Trans. Computers.
[34] Kaushik Roy,et al. DWM-TAPESTRI - An energy efficient all-spin cache using domain wall shift based writes , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[35] Massimiliano Di Ventra,et al. On the physical properties of memristive, memcapacitive and meminductive systems , 2013, Nanotechnology.
[36] Zhao Zhang,et al. Mini-rank: Adaptive DRAM architecture for improving memory power efficiency , 2008, 2008 41st IEEE/ACM International Symposium on Microarchitecture.
[37] D. H. Jacobsohn,et al. A Suggestion for a Fast Multiplier , 1964, IEEE Trans. Electron. Comput..
[38] Eric Pop,et al. Phase change materials and phase change memory , 2014 .
[39] Jacques-Olivier Klein,et al. Magnetic Adder Based on Racetrack Memory , 2013, IEEE Transactions on Circuits and Systems I: Regular Papers.
[40] F. Pellizzer,et al. Novel μtrench phase-change memory cell for embedded and stand-alone non-volatile memory applications , 2004, Digest of Technical Papers. 2004 Symposium on VLSI Technology, 2004.
[41] Vijayalakshmi Srinivasan,et al. Scalable high performance main memory system using phase-change memory technology , 2009, ISCA '09.
[42] Eric Belhaire,et al. New non‐volatile logic based on spin‐MTJ , 2008 .
[43] Holger Orup,et al. Simplifying quotient determination in high-radix modular multiplication , 1995, Proceedings of the 12th Symposium on Computer Arithmetic.
[44] Jean-Pierre Seifert,et al. A new CRT-RSA algorithm secure against bellcore attacks , 2003, CCS '03.
[45] Cong Xu,et al. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory , 2012, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[46] Peter Kornerup. High-radix modular multiplication for cryptosystems , 1993, Proceedings of IEEE 11th Symposium on Computer Arithmetic.
[47] Mahmut T. Kandemir,et al. Leakage Current: Moore's Law Meets Static Power , 2003, Computer.
[48] Stuart S. P. Parkin,et al. Memory on the Racetrack , 2015 .
[49] Weisheng Zhao,et al. Low Power Magnetic Full-Adder Based on Spin Transfer Torque MRAM , 2013, IEEE Transactions on Magnetics.
[50] J. Thomas Pawlowski,et al. Hybrid memory cube (HMC) , 2011, 2011 IEEE Hot Chips 23 Symposium (HCS).
[51] Juan Carlos López,et al. Design and implementation of a coprocessor for cryptography applications , 1997, Proceedings European Design and Test Conference. ED & TC 97.
[52] Per-Åke Larson,et al. Modern Main-Memory Database Systems , 2016, Proc. VLDB Endow..
[53] Ehsan Atoofian,et al. Shift-aware racetrack memory , 2015, 2015 33rd IEEE International Conference on Computer Design (ICCD).
[54] Swaroop Ghosh,et al. Exploiting Serial Access and Asymmetric Read/Write of Domain Wall Memory for Area and Energy-Efficient Digital Signal Processor Design , 2016, IEEE Transactions on Circuits and Systems I: Regular Papers.
[55] Jean-Luc Gaudiot,et al. A Simple High-Speed Multiplier Design , 2006, IEEE Transactions on Computers.
[56] Ming-Der Shieh,et al. Word-Based Montgomery Modular Multiplication Algorithm for Low-Latency Scalable Architectures , 2010, IEEE Transactions on Computers.
[57] Keshab K. Parhi,et al. Design of low-error fixed-width modified booth multiplier , 2004, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[58] Ming-Der Shieh,et al. A New Modular Exponentiation Architecture for Efficient Design of RSA Cryptosystem , 2008, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[59] Çetin Kaya Koç,et al. High-Radix Design of a Scalable Modular Multiplier , 2001, CHES.
[60] Yue Zhang,et al. Ultra-High Density Content Addressable Memory Based on Current Induced Domain Wall Motion in Magnetic Track , 2012, IEEE Transactions on Magnetics.
[61] Akashi Satoh,et al. A Scalable Dual-Field Elliptic Curve Cryptographic Processor , 2003, IEEE Trans. Computers.
[62] Tarek A. El-Ghazawi,et al. New Hardware Architectures for Montgomery Modular Multiplication Algorithm , 2011, IEEE Transactions on Computers.
[63] Ming-Der Shieh,et al. A High-Performance Unified-Field Reconfigurable , 2010 .
[64] Keke Gai,et al. Phase-Change Memory Optimization for Green Cloud with Genetic Algorithm , 2015, IEEE Transactions on Computers.
[65] C. D. Walter,et al. Systolic Modular Multiplication , 1993, IEEE Trans. Computers.
[66] Yun Liang,et al. Performance-Centric Optimization for Racetrack Memory Based Register File on GPUs , 2016, Journal of Computer Science and Technology.
[67] Alfred Menezes,et al. Guide to Elliptic Curve Cryptography , 2004, Springer Professional Computing.
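[68] 裕幸 飯田,et al. On the cleanliness requirements of the International Technology Roadmap for Semiconductors 2003: cleanliness required of silicon wafer surfaces and ambient environments, and the current state of analysis methods , 2004 .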
[69] Jun Yang,et al. Exploit common source-line to construct energy efficient domain wall memory based caches , 2015, 2015 33rd IEEE International Conference on Computer Design (ICCD).
[70] Jian-Ping Wang,et al. A spintronics full adder for magnetic CPU , 2005 .
[71] Duncan G. Elliott,et al. Computational RAM: Implementing Processors in Memory , 1999, IEEE Des. Test Comput..
[72] Scott Hauck,et al. Reconfigurable computing: a survey of systems and software , 2002, CSUR.
[73] S. Parkin,et al. Magnetic Domain-Wall Racetrack Memory , 2008, Science.
[74] Rami G. Melhem,et al. Multilane Racetrack caches: Improving efficiency through compression and independent shifting , 2015, The 20th Asia and South Pacific Design Automation Conference.
[75] H. Ohno,et al. Fabrication of a Nonvolatile Full Adder Based on Logic-in-Memory Architecture Using Magnetic Tunnel Junctions , 2008 .
[76] Sparsh Mittal,et al. A Survey of Techniques for Architecting and Managing GPU Register File , 2017, IEEE Transactions on Parallel and Distributed Systems.
[77] Jaewook Shin,et al. Mapping Irregular Applications to DIVA, a PIM-based Data-Intensive Architecture , 1999, ACM/IEEE SC 1999 Conference (SC'99).
[78] Adi Shamir,et al. A method for obtaining digital signatures and public-key cryptosystems , 1978, CACM.
[79] Kiyoung Choi,et al. A scalable processing-in-memory accelerator for parallel graph processing , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[80] Duane Mills,et al. 19.7 A 16Gb ReRAM with 200MB/s write and 1GB/s read in 27nm technology , 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).
[81] Yu Wang,et al. Hi-fi playback: Tolerating position errors in shift operations of racetrack memory , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[82] T. Kuroda. CMOS design challenges to power wall , 2001, Digest of Papers. Microprocesses and Nanotechnology 2001. 2001 International Microprocesses and Nanotechnology Conference (IEEE Cat. No.01EX468).
[83] Issa M. Khalil,et al. Cloud Computing Security: A Survey , 2014, Comput..
[84] Haomin Wu,et al. A new design of the CMOS full adder , 1992 .
[85] G. Finocchio,et al. A strategy for the design of skyrmion racetrack memories , 2014, Scientific Reports.
[86] Yajun Ha,et al. A Low Active Leakage and High Reliability Phase Change Memory (PCM) Based Non-Volatile FPGA Storage Element , 2014, IEEE Transactions on Circuits and Systems I: Regular Papers.
[87] Bernard Dieny,et al. Synchronous 8-bit Non-Volatile Full-Adder based on Spin Transfer Torque Magnetic Tunnel Junction , 2015, IEEE Transactions on Circuits and Systems I: Regular Papers.
[88] Tarek Darwish,et al. Performance analysis of low-power 1-bit CMOS full adder cells , 2002, IEEE Trans. Very Large Scale Integr. Syst..
[89] Kaushik Roy,et al. Energy-Efficient All-Spin Cache Hierarchy Using Shift-Based Writes and Multilevel Storage , 2015, ACM J. Emerg. Technol. Comput. Syst..
[90] Alexander Albicki,et al. Low power and high speed multiplication design through mixed number representations , 1995, Proceedings of ICCD '95 International Conference on Computer Design. VLSI in Computers and Processors.
[91] Tolga Acar,et al. Analyzing and comparing Montgomery multiplication algorithms , 1996, IEEE Micro.
[92] Burton S. Kaliski,et al. The Montgomery Inverse and Its Applications , 1995, IEEE Trans. Computers.
[93] Fabien Clermidy,et al. Bipolar ReRAM Based non-volatile flip-flops for low-power architectures , 2012, 10th IEEE International NEWCAS Conference.
[94] Dejan Markovic,et al. True Energy-Performance Analysis of the MTJ-Based Logic-in-Memory Architecture (1-Bit Full Adder) , 2010, IEEE Transactions on Electron Devices.
[95] Craig Gentry,et al. Fully homomorphic encryption using ideal lattices , 2009, STOC '09.
[96] Hai Li,et al. Quantitative modeling of racetrack memory, a tradeoff among area, performance, and power , 2015, The 20th Asia and South Pacific Design Automation Conference.
[97] Yiran Chen,et al. Exploration of GPGPU register file architecture using domain-wall-shift-write based racetrack memory , 2014, 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC).
[98] Kaushik Roy,et al. STAG: Spintronic-Tape Architecture for GPGPU cache hierarchies , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).
[99] Sanghyeon Lee,et al. Enhanced cycling endurance in phase change memory via electrical control of switching induced atomic migration , 2014, 2014 14th Annual Non-Volatile Memory Technology Symposium (NVMTS).
[100] Orest J. Bedrij. Carry-Select Adder , 1962, IRE Trans. Electron. Comput..
[101] Kaushik Roy,et al. Cache Design with Domain Wall Memory , 2016, IEEE Transactions on Computers.
[102] M.-J. Hsiao,et al. Carry-select adder using single ripple-carry adder , 1998 .
[103] H-S Philip Wong,et al. Memory leads the way to better computing. , 2015, Nature nanotechnology.
[104] Victor S. Miller,et al. Use of Elliptic Curves in Cryptography , 1985, CRYPTO.
[105] Hao Yu,et al. Energy efficient in-memory AES encryption based on nonvolatile domain-wall nanowire , 2014, 2014 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[106] Yiran Chen,et al. Compiler-assisted refresh minimization for volatile STT-RAM cache , 2013, 2013 18th Asia and South Pacific Design Automation Conference (ASP-DAC).
[107] Wenqing Wu,et al. Cross-layer racetrack memory design for ultra high density and low power consumption , 2013, 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC).
[108] Amar Mandal,et al. Tripartite Modular Multiplication using Toom-Cook Multiplication , 2012 .
[109] Rami G. Melhem,et al. ContextPreRF: Enhancing the Performance and Energy of GPUs With Nonuniform Register Access , 2016, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[110] G. Huang,et al. An Energy-Efficient Nonvolatile In-Memory Computing Architecture for Extreme Learning Machine by Domain-Wall Nanowire Devices , 2015, IEEE Transactions on Nanotechnology.
[111] Mahmut T. Kandemir,et al. Evaluating STT-RAM as an energy-efficient main memory alternative , 2013, 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[112] Hari Balakrishnan,et al. CryptDB: protecting confidentiality with encrypted query processing , 2011, SOSP.
[113] Yi Gang,et al. A High-Reliability, Low-Power Magnetic Full Adder , 2011, IEEE Transactions on Magnetics.
[114] Kailash Gopalakrishnan,et al. Overview of candidate device technologies for storage-class memory , 2008, IBM J. Res. Dev..
[115] Yiran Chen,et al. Design of Last-Level On-Chip Cache Using Spin-Torque Transfer RAM (STT RAM) , 2011, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[116] Çetin Kaya Koç,et al. A Scalable Architecture for Montgomery Multiplication , 1999, CHES.
[117] G. Servalli,et al. A 45nm generation Phase Change Memory technology , 2009, 2009 IEEE International Electron Devices Meeting (IEDM).
[118] Jun Yang,et al. A durable and energy efficient main memory using phase change memory technology , 2009, ISCA '09.
[119] Sascha Vongehr,et al. The Missing Memristor has Not been Found , 2015, Scientific Reports.
[120] Çetin Kaya Koç,et al. A Scalable Architecture for Modular Multiplication Based on Montgomery's Algorithm , 2003, IEEE Trans. Computers.
[121] Tian-Sheuan Chang,et al. A new RSA cryptosystem hardware design based on Montgomery's algorithm , 1998 .
[122] Anand Raghunathan,et al. Domain-Specific Many-core Computing using Spin-based Memory , 2014, IEEE Transactions on Nanotechnology.
[123] R. Schaller,et al. Moore's law: past, present and future , 1997 .
[124] Wenqing Wu,et al. Multi retention level STT-RAM cache designs with a dynamic refresh scheme , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[125] Haralampos Pozidis,et al. Recent Progress in Phase-Change Memory Technology , 2016, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[126] Makoto Motoyoshi,et al. Through-Silicon Via (TSV) , 2009, Proceedings of the IEEE.
[127] H.-S. Philip Wong,et al. Phase Change Memory , 2010, Proceedings of the IEEE.
[128] N. Koblitz. Elliptic curve cryptosystems , 1987 .
[129] Nisha Checka,et al. Technology, performance, and computer-aided design of three-dimensional integrated circuits , 2004, ISPD '04.
[130] Colin D. Walter. Space/Time Trade-Offs for Higher Radix Modular Multiplication Using Repeated Addition , 1997, IEEE Trans. Computers.