Liquid Silicon-Monona: A Reconfigurable Memory-Oriented Computing Fabric with Scalable Multi-Context Support
暂无分享,去创建一个
[1] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, NIPS.
[2] Yajun Ha,et al. A Low Active Leakage and High Reliability Phase Change Memory (PCM) Based Non-Volatile FPGA Storage Element , 2014, IEEE Transactions on Circuits and Systems I: Regular Papers.
[3] Malgorzata Marek-Sadowska,et al. Partitioning Sequential Circuits on Dynamically Reconfigurable FPGAs , 1999, IEEE Trans. Computers.
[4] André DeHon,et al. Location, location, location: the role of spatial locality in asymptotic energy minimization , 2013, FPGA '13.
[5] Eriko Nurvitadhi,et al. Accelerating Binarized Neural Networks: Comparison of FPGA, CPU, GPU, and ASIC , 2016, 2016 International Conference on Field-Programmable Technology (FPT).
[6] Robert K. Brayton,et al. Combinational and sequential mapping with priority cuts , 2007, 2007 IEEE/ACM International Conference on Computer-Aided Design.
[7] Ran Ginosar,et al. Resistive Associative Processor , 2015, IEEE Computer Architecture Letters.
[8] Jason Cong,et al. mrFPGA: A novel FPGA architecture with memristor-based reconfiguration , 2011, 2011 IEEE/ACM International Symposium on Nanoscale Architectures.
[9] Shinichi Yasuda,et al. A pure-CMOS nonvolatile multi-context configuration memory for dynamically reconfigurable FPGAs , 2014, International Conference on Field-Programmable Technology.
[10] Fadi J. Kurdahi,et al. A framework for reconfigurable computing: task scheduling and context management , 2001, IEEE Trans. Very Large Scale Integr. Syst..
[11] Yu Zhang,et al. Enabling FPGAs in the cloud , 2014, Conf. Computing Frontiers.
[12] Philip Heng Wai Leong,et al. FINN: A Framework for Fast, Scalable Binarized Neural Network Inference , 2016, FPGA.
[13] Ashish Goel,et al. Similarity search and locality sensitive hashing using ternary content addressable memories , 2010, SIGMOD Conference.
[14] Dave Brown,et al. Supplementary Material for An Efficient and Scalable Semiconductor Architecture for Parallel Automata Processing , 2013 .
[15] Jason Luu,et al. VPR 5.0: FPGA cad and architecture exploration tools with single-driver routing, heterogeneity and process scaling , 2009, FPGA '09.
[16] Eby G. Friedman,et al. AC-DIMM: associative computing with STT-MRAM , 2013, ISCA.
[17] An Chen,et al. A Comprehensive Crossbar Array Model With Solutions for Line Resistance and Nonlinear Device Characteristics , 2013, IEEE Transactions on Electron Devices.
[18] Xi Chen,et al. FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates , 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[19] Carl Ebeling,et al. Hardware Acceleration of Short Read Mapping , 2012, 2012 IEEE 20th International Symposium on Field-Programmable Custom Computing Machines.
[20] Bharat Sukhwani,et al. Accelerating Join Operation for Relational Databases with FPGAs , 2013, 2013 IEEE 21st Annual International Symposium on Field-Programmable Custom Computing Machines.
[21] Javier Resano,et al. Specific scheduling support to minimize the reconfiguration overhead of dynamically reconfigurable hardware , 2004, Proceedings. 41st Design Automation Conference, 2004..
[22] Rajesh Gupta,et al. Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs , 2017, FPGA.
[23] Jacques-Olivier Klein,et al. MRAM crossbar based configurable logic block , 2012, 2012 IEEE International Symposium on Circuits and Systems.
[24] Yong Wang,et al. SDA: Software-defined accelerator for large-scale DNN systems , 2014, 2014 IEEE Hot Chips 26 Symposium (HCS).
[25] Weirong Jiang. Scalable Ternary Content Addressable Memory implementation using FPGAs , 2013, Architectures for Networking and Communications Systems.
[26] Alberto Leon-Garcia,et al. FPGAs in the Cloud: Booting Virtualized Hardware Accelerators with OpenStack , 2014, 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines.
[27] Paolo Ienne,et al. Virtualized Execution Runtime for FPGA Accelerators in the Cloud , 2017, IEEE Access.
[28] K. Pagiamtzis,et al. Content-addressable memory (CAM) circuits and architectures: a tutorial and survey , 2006, IEEE Journal of Solid-State Circuits.
[29] Meng-Fan Chang,et al. 17.5 A 3T1R nonvolatile TCAM using MLC ReRAM with Sub-1ns search time , 2015, 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers.
[30] Seth Copen Goldstein,et al. Configuration Caching and Swapping , 2001, FPL.
[31] Meng-Fan Chang,et al. 7.4 A 256b-wordlength ReRAM-based TCAM with 1ns search-time and 14× improvement in wordlength-energyefficiency-density product using 2.5T1R cell , 2016, 2016 IEEE International Solid-State Circuits Conference (ISSCC).
[32] Andrew S. Cassidy,et al. Convolutional networks for fast, energy-efficient neuromorphic computing , 2016, Proceedings of the National Academy of Sciences.
[33] Engin Ipek,et al. A resistive TCAM accelerator for data-intensive computing , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[34] Yukio Hayakawa,et al. An 8 Mb Multi-Layered Cross-Point ReRAM Macro With 443 MB/s Write Throughput , 2012, IEEE Journal of Solid-State Circuits.
[35] Kristen Grauman,et al. Kernelized locality-sensitive hashing for scalable image search , 2009, 2009 IEEE 12th International Conference on Computer Vision.
[36] Tuo-Hung Hou,et al. Resistive random access memory (RRAM) technology: From material, device, selector, 3D integration to bottom-up fabrication , 2017, Journal of Electroceramics.
[37] Tianshi Chen,et al. ShiDianNao: Shifting vision processing closer to the sensor , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[38] Yiran Chen,et al. PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[39] Dennis W. Prather,et al. FPGA-based acceleration of the 3D finite-difference time-domain method , 2004, 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines.
[40] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[41] Joel Emer,et al. Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks , 2016, CARN.
[42] Qinru Qiu,et al. SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing , 2016, ASPLOS.
[43] A. Tsai,et al. PipeRench: A virtualized programmable datapath in 0.18 micron technology , 2002, Proceedings of the IEEE 2002 Custom Integrated Circuits Conference (Cat. No.02CH37285).
[44] Kinam Kim,et al. A fast, high-endurance and scalable non-volatile memory device made from asymmetric Ta2O(5-x)/TaO(2-x) bilayer structures. , 2011, Nature materials.
[45] Steven Trimberger,et al. A time-multiplexed FPGA , 1997, Proceedings. The 5th Annual IEEE Symposium on Field-Programmable Custom Computing Machines Cat. No.97TB100186).
[46] Paris Smaragdis,et al. Bitwise Neural Networks , 2016, ArXiv.
[47] Yung-Hsiang Lu,et al. Large-scale Image Processing using Amazon EC2 Spot Instances , 2016, Image Quality and System Performance.
[48] Jeremy Buhler,et al. Efficient large-scale sequence comparison by locality-sensitive hashing , 2001, Bioinform..
[49] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[50] Kenneth B. Kent,et al. VPR 5.0: FPGA CAD and architecture exploration tools with single-driver routing, heterogeneity and process scaling , 2011, TRETS.
[51] Zhiyuan Li,et al. Configuration caching management techniques for reconfigurable computing , 2000, Proceedings 2000 IEEE Symposium on Field-Programmable Custom Computing Machines (Cat. No.PR00871).
[52] S. Jo,et al. 3D-stackable crossbar resistive memory based on Field Assisted Superlinear Threshold (FAST) selector , 2014, 2014 IEEE International Electron Devices Meeting.
[53] Christoforos E. Kozyrakis,et al. Evaluating MapReduce for Multi-core and Multiprocessor Systems , 2007, 2007 IEEE 13th International Symposium on High Performance Computer Architecture.
[54] S. Yang,et al. Logic Synthesis and Optimization Benchmarks User Guide Version 3.0 , 1991 .
[55] Yu Wang,et al. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[56] Engin Ipek,et al. Memristive Boltzmann machine: A hardware accelerator for combinatorial optimization and deep learning , 2017, 2017 Fifth Berkeley Symposium on Energy Efficient Electronic Systems & Steep Transistors Workshop (E3S).
[57] Sudhakar Yalamanchili,et al. Neurocube: A Programmable Digital Neuromorphic Architecture with High-Density 3D Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[58] Takahiro Hanyu,et al. Nonvolatile field-programmable gate array using 2-transistor–1-MTJ-cell-based multi-context array for power and area efficient dynamically reconfigurable logic , 2015 .
[59] Sung-Mo Kang,et al. RRAM-based TCAMs for pattern search , 2016, 2016 IEEE International Symposium on Circuits and Systems (ISCAS).
[60] Gu-Yeon Wei,et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[61] R. Williams,et al. Sub-nanosecond switching of a tantalum oxide memristor , 2011, Nanotechnology.
[62] Shimeng Yu,et al. Metal–Oxide RRAM , 2012, Proceedings of the IEEE.
[63] Ryutaro Yasuhara,et al. Switching and reliability mechanisms for ReRAM , 2014, IEEE International Interconnect Technology Conference.
[64] James C. Hoe,et al. CoRAM: an in-fabric memory architecture for FPGA-based computing , 2011, FPGA '11.
[65] Henry Hoffmann,et al. The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs , 2002, IEEE Micro.
[66] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[67] Yusuf Leblebici,et al. GMS: Generic memristive structure for non-volatile FPGAs , 2012, 2012 IEEE/IFIP 20th International Conference on VLSI and System-on-Chip (VLSI-SoC).
[68] Jason Cong,et al. FPGA-RPI: A Novel FPGA Architecture With RRAM-Based Programmable Interconnects , 2014, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[69] Wei Zhang,et al. Non-volatile 3D stacking RRAM-based FPGA , 2012, 22nd International Conference on Field Programmable Logic and Applications (FPL).
[70] Fadi J. Kurdahi,et al. Configuration management in multi-context reconfigurable systems for simultaneous performance and power optimizations , 2000, ISSS '00.
[71] Z. Wei,et al. Highly reliable TaOx ReRAM and direct evidence of redox reaction mechanism , 2008, 2008 IEEE International Electron Devices Meeting.
[72] Hari Angepat,et al. A cloud-scale acceleration architecture , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[73] André DeHon,et al. DPGA Utilization and Application , 1996, Fourth International ACM Symposium on Field-Programmable Gate Arrays.
[74] Antonio Torralba,et al. Decoder-driven switching matrices in multicontext FPGAs: area reduction and their effect on routability , 1999, ISCAS'99. Proceedings of the 1999 IEEE International Symposium on Circuits and Systems VLSI (Cat. No.99CH36349).
[75] Kermin Fleming,et al. Leap scratchpads: automatic memory and cache management for reconfigurable logic , 2010, FPGA '11.
[76] Xuehai Zhou,et al. PuDianNao: A Polyvalent Machine Learning Accelerator , 2015, ASPLOS.
[77] Miao Hu,et al. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[78] Kizheppatt Vipin,et al. Virtualized FPGA Accelerators for Efficient Cloud Computing , 2015, 2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom).
[79] Jing Li,et al. 1 Mb 0.41 µm² 2T-2R Cell Nonvolatile TCAM With Two-Bit Encoding and Clocked Self-Referenced Sensing , 2014, IEEE Journal of Solid-State Circuits.
[80] Zhe Wang,et al. Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search , 2007, VLDB.
[81] Jonathan S. Turner,et al. ClassBench: A Packet Classification Benchmark , 2005, IEEE/ACM Transactions on Networking.
[82] Wei Zhang,et al. Melia: A MapReduce Framework on OpenCL-Based FPGAs , 2016, IEEE Transactions on Parallel and Distributed Systems.
[83] Yuichiro Shibata,et al. A prototype chip of multicontext FPGA with DRAM for virtual hardware , 2001, ASP-DAC '01.
[84] Jari Nurmi,et al. Static scheduling techniques for dependent tasks on dynamically reconfigurable devices , 2007, J. Syst. Archit..
[85] Sen Wang,et al. VTR 7.0: Next Generation Architecture and CAD System for FPGAs , 2014, TRETS.
[86] Yoshua Bengio,et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations , 2015, NIPS.
[87] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[88] Jing Li,et al. Reconfigurable in-memory computing with resistive memory crossbar , 2016, 2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[89] R. Njuguna. A Survey of FPGA Benchmarks , 2008 .
[90] Jung Ho Ahn,et al. DRAMA: An Architecture for Accelerated Processing Near Memory , 2015, IEEE Computer Architecture Letters.
[91] Francky Catthoor,et al. A hybrid prefetch scheduling heuristic to minimize at run-time the reconfiguration overhead of dynamically reconfigurable hardware [multimedia applications] , 2005, Design, Automation and Test in Europe.
[92] Peilin Song,et al. 1Mb 0.41 µm2 2T-2R cell nonvolatile TCAM with two-bit encoding and clocked self-referenced sensing , 2013, 2013 Symposium on VLSI Circuits.
[93] Jing Li,et al. Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network , 2017, FPGA.
[94] Abbas El Gamal,et al. Nonvolatile 3D-FPGA with monolithically stacked RRAM-based configuration memory , 2012, 2012 IEEE International Solid-State Circuits Conference.
[95] Patrick Pantel,et al. Randomized Algorithms and NLP: Using Locality Sensitive Hash Functions for High Speed Noun Clustering , 2005, ACL.
[96] Rainer G. Spallek,et al. RC3E: Provision and Management of Reconfigurable Hardware Accelerators in a Cloud Environment , 2015, ArXiv.