MB-CNN: Memristive Binary Convolutional Neural Networks for Embedded Mobile Devices
Mahdi Nazm Bojnordi | Pranav Kulkarni | Arjun Pal Chowdhury
[1] Duncan G. Elliott,et al. Computational RAM: Implementing Processors in Memory , 1999, IEEE Des. Test Comput..
[2] Forrest N. Iandola,et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.
[3] C. H. Cheng,et al. Ultralow Switching Energy Ni/GeOx/HfON/TaN RRAM , 2011, IEEE Electron Device Letters.
[4] Takeyoshi Ohashi,et al. Variability study with CD-SEM metrology for STT-MRAM: correlation analysis between physical dimensions and electrical property of the memory element , 2017, Advanced Lithography.
[5] Yoshua Bengio,et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations , 2015, NIPS.
[6] Yan Li,et al. 128Gb 3b/cell NAND flash memory in 19nm technology with 18MB/s write rate and 400Mb/s toggle mode , 2012, 2012 IEEE International Solid-State Circuits Conference.
[7] M. Oskin,et al. Active Pages: a computation model for intelligent memory , 1998, Proceedings. 25th Annual International Symposium on Computer Architecture (Cat. No.98CB36235).
[8] Josep Torrellas,et al. WearCore: A core for wearable workloads? , 2016, 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT).
[9] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[10] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[11] Kai Li,et al. The PARSEC benchmark suite: Characterization and architectural implications , 2008, 2008 International Conference on Parallel Architectures and Compilation Techniques (PACT).
[12] Wei Wang,et al. Highly improved resistive switching performances of the self-doped Pt/HfO2:Cu/Cu devices by atomic layer deposition , 2016 .
[13] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.
[14] Sebastian Ehrlichmann. VLSI Design Techniques for Analog and Digital Circuits , 2016 .
[15] Luca Benini,et al. XNORBIN: A 95 TOp/s/W hardware accelerator for binary convolutional neural networks , 2018, 2018 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS).
[16] Andrew B. Kahng,et al. CACTI-IO: CACTI With OFF-Chip Power-Area-Timing Models , 2015, IEEE Trans. Very Large Scale Integr. Syst..
[17] Khaled N. Salama,et al. Memristor-based memory: The sneak paths problem and solutions , 2013, Microelectron. J..
[18] Karthikeyan Sankaralingam,et al. Dark Silicon and the End of Multicore Scaling , 2012, IEEE Micro.
[19] F. Pellizzer,et al. Novel μtrench phase-change memory cell for embedded and stand-alone non-volatile memory applications , 2004, Digest of Technical Papers. 2004 Symposium on VLSI Technology, 2004..
[20] Eric Pop,et al. Energy-Efficient Phase-Change Memory with Graphene as a Thermal Barrier. , 2015, Nano letters.
[21] T. Yamamoto,et al. Low-power embedded ReRAM technology for IoT applications , 2015, 2015 Symposium on VLSI Technology (VLSI Technology).
[22] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[23] Payman Behnam,et al. Accelerating k-Medians Clustering Using a Novel 4T-4R RRAM Cell , 2018, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[24] Behzad Razavi,et al. Principles of Data Conversion System Design , 1994 .
[25] Tao Zhang,et al. Overcoming the challenges of crossbar resistive memory architectures , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[26] Tao Zhang,et al. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[27] Cong Xu,et al. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory , 2012, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[28] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[29] Moinuddin K. Qureshi,et al. Morphable memory system: a robust architecture for exploiting multi-level phase change memories , 2010, ISCA.
[30] Joan Bruna,et al. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation , 2014, NIPS.
[31] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, ArXiv.
[32] Dara Rahmati,et al. A Performance and Power Analysis of WK-Recursive and Mesh Networks for Network-on-Chips , 2006, 2006 International Conference on Computer Design.
[33] Qi Liu,et al. Super non-linear RRAM with ultra-low power for 3D vertical nano-crossbar arrays. , 2016, Nanoscale.
[34] Shaahin Angizi,et al. IMCE: Energy-efficient bit-wise in-memory convolution engine for deep neural network , 2018, 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC).
[35] Tohru Ozaki,et al. A 100 MHz Ladder FeRAM Design With Capacitance-Coupled-Bitline (CCB) Cell , 2011, IEEE Journal of Solid-State Circuits.
[36] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[37] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[38] Mikko H. Lipasti,et al. BenchNN: On the broad potential application scope of hardware neural network accelerators , 2012, 2012 IEEE International Symposium on Workload Characterization (IISWC).
[39] Minje Kim,et al. XNOR-POP: A processing-in-memory architecture for binary Convolutional Neural Networks in Wide-IO2 DRAMs , 2017, 2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).
[40] Jung Ho Ahn,et al. McPAT: An integrated power, area, and timing modeling framework for multicore and manycore architectures , 2009, 2009 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[41] Soheil Ghiasi,et al. Fast and Energy-Efficient CNN Inference on IoT Devices , 2016, ArXiv.
[42] Glenn Reinman,et al. BRAINIAC: Bringing reliable accuracy into neurally-implemented approximate computing , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).
[43] Tohru Ozaki,et al. A 64-Mb Chain FeRAM With Quad BL Architecture and 200 MB/s Burst Mode , 2010, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[44] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[45] E. Vianello,et al. 28nm advanced CMOS resistive RAM solution as embedded non-volatile memory , 2014, 2014 IEEE International Reliability Physics Symposium.
[46] Yixin Chen,et al. Compressing Neural Networks with the Hashing Trick , 2015, ICML.
[47] Maya Gokhale,et al. Processing in Memory: The Terasys Massively Parallel PIM Array , 1995, Computer.
[48] J. Yang,et al. Memristive switching mechanism for metal/oxide/metal nanodevices. , 2008, Nature nanotechnology.
[49] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[50] Miao Hu,et al. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[51] Daniel Povey,et al. The Kaldi Speech Recognition Toolkit , 2011 .
[52] Zhitang Song,et al. Superlattice-like GeTe/Sb thin film for ultra-high speed phase change memory applications , 2017 .
[53] Engin Ipek,et al. Memristive Boltzmann machine: A hardware accelerator for combinatorial optimization and deep learning , 2017 .
[54] Jose Renau,et al. ESESC: A fast multicore simulator using Time-Based Sampling , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).
[55] F. Zeng,et al. Recent progress in resistive random access memories: Materials, switching mechanisms, and performance , 2014 .
[56] Yu Wang,et al. Binary convolutional neural network on RRAM , 2017, 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC).
[57] Narayanan Vijaykrishnan,et al. Nonvolatile Processor Architectures: Efficient, Reliable Progress with Unstable Power , 2016, IEEE Micro.
[58] Yuan Gao,et al. RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[59] Yukio Hayakawa,et al. An 8 Mb Multi-Layered Cross-Point ReRAM Macro With 443 MB/s Write Throughput , 2012, IEEE Journal of Solid-State Circuits.
[60] Chris Yakopcic,et al. Memristor-based neuron circuit and method for applying learning algorithm in SPICE , 2014 .
[61] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[62] Andrew Y. Ng,et al. Reading Digits in Natural Images with Unsupervised Feature Learning , 2011 .
[63] Alexander Gruenstein,et al. Accurate and compact large vocabulary speech recognition on mobile devices , 2013, INTERSPEECH.
[64] Mohammad Rastegari,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016 .
[65] Ming-Jinn Tsai,et al. Low-Power MCU With Embedded ReRAM Buffers as Sensor Hub for IoT Applications , 2016, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[66] Hyunsang Hwang,et al. Materials and process aspect of cross-point RRAM (invited) , 2011 .
[67] Cong Xu,et al. Design trade-offs for high density cross-point resistive memory , 2012, ISLPED '12.
[68] Hai Li,et al. A practical low-power memristor-based analog neural branch predictor , 2013, International Symposium on Low Power Electronics and Design (ISLPED).
[69] Cong Xu,et al. Design implications of memristor-based RRAM cross-point structures , 2011, 2011 Design, Automation & Test in Europe.
[70] Gang Hua,et al. How to Train a Compact Binary Neural Network with High Accuracy? , 2017, AAAI.
[71] Yu Wang,et al. Going Deeper with Embedded FPGA Platform for Convolutional Neural Network , 2016, FPGA.
[72] Fei-Fei Li,et al. ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[73] Soheil Ghiasi,et al. CNNdroid: GPU-Accelerated Execution of Trained Deep Convolutional Neural Networks on Android , 2015, ACM Multimedia.
[74] Tianshi Chen,et al. ShiDianNao: Shifting vision processing closer to the sensor , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[75] Yiran Chen,et al. Design Margin Exploration of Spin-Transfer Torque RAM (STT-RAM) in Scaled Technologies , 2010, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[76] Borivoje Nikolic,et al. A Differential 2R Crosspoint RRAM Array With Zero Standby Current , 2015, IEEE Transactions on Circuits and Systems II: Express Briefs.
[77] Ming Yang,et al. Compressing Deep Convolutional Networks using Vector Quantization , 2014, ArXiv.
[78] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[79] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[80] Walt Kester,et al. The data conversion handbook , 2005 .
[81] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[82] Luis Ceze,et al. Neural Acceleration for General-Purpose Approximate Programs , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.
[83] Albert Chin,et al. Novel Ultra-low power RRAM with good endurance and retention , 2010, 2010 Symposium on VLSI Technology.
[84] Jun Yang,et al. A durable and energy efficient main memory using phase change memory technology , 2009, ISCA '09.
[85] Yiran Chen,et al. Multi-level cell STT-RAM: Is it realistic or just a dream? , 2012, 2012 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[86] Ming Yang,et al. DeepFace: Closing the Gap to Human-Level Performance in Face Verification , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.