MnnFast: A Fast and Scalable System Architecture for Memory-Augmented Neural Networks
Jaewon Lee | Jangwoo Kim | Joonsung Kim | Hanhwi Jang | Jae-Eon Jo | Jang-Hyun Kim
[1] Forrest N. Iandola,et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.
[2] Diana Marculescu,et al. Layer-compensated Pruning for Resource-constrained Convolutional Neural Networks , 2018, ArXiv.
[3] Mengjia Yan,et al. UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[4] André Seznec,et al. Practical data value speculation for future high-end processors , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).
[5] Rajesh K. Gupta,et al. SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[6] Boris Murmann,et al. A Pixel Pitch-Matched Ultrasound Receiver for 3-D Photoacoustic Imaging With Integrated Delta-Sigma Beamformer in 28-nm UTBB FD-SOI , 2017, IEEE Journal of Solid-State Circuits.
[7] Shaoli Liu,et al. Cambricon-X: An accelerator for sparse neural networks , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[8] David A. Wood,et al. LogCA: A high-level performance model for hardware accelerators , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[9] Jason Weston,et al. Key-Value Memory Networks for Directly Reading Documents , 2016, EMNLP.
[10] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[11] Gianluca Palermo,et al. mARGOt: A Dynamic Autotuning Framework for Self-Aware Approximate Computing , 2019, IEEE Transactions on Computers.
[12] Eric Cheng,et al. Very Low Voltage (VLV) Design , 2017, 2017 IEEE International Conference on Computer Design (ICCD).
[13] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[14] Mario Badr,et al. Load Value Approximation , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[15] Gu-Yeon Wei,et al. On-Chip Deep Neural Network Storage with Multi-Level eNVM , 2018, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).
[16] Pradeep Dubey,et al. SCALEDEEP: A scalable compute architecture for learning and evaluating deep networks , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[17] Jason Weston,et al. Dialog-based Language Learning , 2016, NIPS.
[18] Tao Li,et al. Prediction Based Execution on Deep Neural Networks , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[19] Yu Cao,et al. Scalable and modularized RTL compilation of Convolutional Neural Networks onto FPGA , 2016, 2016 26th International Conference on Field Programmable Logic and Applications (FPL).
[20] Dan Grossman,et al. Probability type inference for flexible approximate programming , 2015, OOPSLA.
[21] Zellig S. Harris,et al. Distributional Structure , 1954.
[22] Dan Alistarh,et al. Distributed Learning over Unreliable Networks , 2018, ICML.
[23] Meng-Fan Chang,et al. DL-RSIM: A Simulation Framework to Enable Reliable ReRAM-based Accelerators for Deep Learning , 2018, 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[24] Vivienne Sze,et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey , 2017, Proceedings of the IEEE.
[25] Brandon Lucia,et al. Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems , 2018, ASPLOS.
[26] David Wentzlaff,et al. Scaling Datacenter Accelerators with Compute-Reuse Architectures , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[27] Tze Meng Low,et al. SPIRAL: Extreme Performance Portability , 2018, Proceedings of the IEEE.
[28] Diana Marculescu,et al. HyperPower: Power- and memory-constrained hyper-parameter optimization for neural networks , 2017, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[29] Diana Marculescu,et al. LightNN: Filling the Gap between Conventional Deep Neural Networks and Binarized Networks , 2017, ACM Great Lakes Symposium on VLSI.
[30] Gu-Yeon Wei,et al. Cognitive Computing Safety: The New Horizon for Reliability / The Design and Evolution of Deep Learning Workloads , 2017, IEEE Micro.
[31] Jason Weston,et al. Large-scale Simple Question Answering with Memory Networks , 2015, ArXiv.
[32] Jason Weston,et al. Learning End-to-End Goal-Oriented Dialog , 2016, ICLR.
[33] Robert A. van de Geijn,et al. Anatomy of high-performance matrix multiplication , 2008, TOMS.
[34] Massoud Pedram,et al. VIBNN: Hardware Acceleration of Bayesian Neural Networks , 2018, ASPLOS.
[35] Christopher W. Fletcher,et al. Morph: Flexible Acceleration for 3D CNN-Based Video Understanding , 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[36] Amar Phanishayee,et al. Gist: Efficient Data Encoding for Deep Neural Network Training , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[37] Hyesoon Kim,et al. StaleLearn: Learning Acceleration with Asynchronous Synchronization Between Model Replicas on PIM , 2018, IEEE Transactions on Computers.
[38] Jason Weston,et al. End-To-End Memory Networks , 2015, NIPS.
[39] Tao Li,et al. Towards Efficient Microarchitectural Design for Accelerating Unsupervised GAN-Based Deep Learning , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[40] Diana Marculescu,et al. Hardware-Aware Machine Learning: Modeling and Optimization , 2018, 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[41] Kunle Olukotun,et al. Understanding and optimizing asynchronous low-precision stochastic gradient descent , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[42] Martin Dyer,et al. Leibniz International Proceedings in Informatics, LIPIcs , 2016, ICALP 2016.
[43] Jose-Maria Arnau,et al. The Dark Side of DNN Pruning , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[44] Pradip Bose,et al. Impact of Software Approximations on the Resiliency of a Video Summarization System , 2018, 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
[45] Abdullah Muzahid,et al. Approximeter: Automatically finding and quantifying code sections for approximation , 2017, 2017 IEEE International Symposium on Workload Characterization (IISWC).
[46] David Blaauw,et al. Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[47] Gu-Yeon Wei,et al. A case for efficient accelerator design space exploration via Bayesian optimization , 2017, 2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).
[48] Yiran Chen,et al. A Compact DNN: Approaching GoogLeNet-Level Accuracy of Classification and Domain Adaptation , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[49] Onur Mutlu,et al. RFVP: Rollback-Free Value Prediction with Safe-to-Approximate Loads , 2016, ACM Trans. Archit. Code Optim..
[50] Gu-Yeon Wei,et al. Ares: A framework for quantifying the resilience of deep neural networks , 2018, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).
[51] Gu-Yeon Wei,et al. The Aladdin Approach to Accelerator Design and Modeling , 2015, IEEE Micro.
[52] Jing Wang,et al. In-Situ AI: Towards Autonomous and Incremental Deep Learning for IoT Systems , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[53] Soheil Ghiasi,et al. Hardware-oriented Approximation of Convolutional Neural Networks , 2016, ArXiv.
[54] Dan Alistarh,et al. Model compression via distillation and quantization , 2018, ICLR.
[55] Lingjia Tang,et al. The Architectural Implications of Autonomous Driving: Constraints and Acceleration , 2018, ASPLOS.
[56] Gu-Yeon Wei,et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[57] Olivier Temam,et al. A defect-tolerant accelerator for emerging high-performance applications , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[58] Alexander M. Rush,et al. Weightless: Lossy Weight Encoding For Deep Neural Network Compression , 2018, ICML.
[59] Qian Wang,et al. AUGEM: Automatically generate high performance Dense Linear Algebra kernels on x86 CPUs , 2013, 2013 SC - International Conference for High Performance Computing, Networking, Storage and Analysis (SC).
[60] Yiran Chen,et al. ReCom: An efficient resistive accelerator for compressed deep neural networks , 2018, 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[61] Suren Jayasuriya,et al. EVA²: Exploiting Temporal Redundancy in Live Computer Vision , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[62] Xiang Zhang,et al. Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems , 2015, ICLR.
[63] Jose-Maria Arnau,et al. Computation Reuse in DNNs by Exploiting Input Similarity , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[64] Daehyun Kim,et al. μLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization , 2019, EuroSys.
[65] Scott A. Mahlke,et al. Scalpel: Customizing DNN pruning to the underlying hardware parallelism , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[66] Saibal Mukhopadhyay,et al. A programmable hardware accelerator for simulating dynamical systems , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[67] Scott A. Mahlke,et al. DeftNN: Addressing Bottlenecks for DNN Execution on GPUs via Synapse Vector Elimination and Near-compute Data Fission , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[68] Jason Weston,et al. Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks , 2015, ICLR.
[69] Eunhyeok Park,et al. Energy-Efficient Neural Network Accelerator Based on Outlier-Aware Low-Precision Computation , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[70] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[71] Eric S. Chung,et al. A Configurable Cloud-Scale DNN Processor for Real-Time AI , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[72] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[73] Mark Davies. The Corpus of Contemporary American English (COCA) , 2012.
[74] Jason Weston,et al. The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations , 2015, ICLR.
[75] Jason Weston,et al. Dialogue Learning With Human-In-The-Loop , 2016, ICLR.
[76] Engin Ipek,et al. Enabling Scientific Computing on Memristive Accelerators , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[77] Eriko Nurvitadhi,et al. Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks? , 2017, FPGA.
[78] Diana Marculescu,et al. Designing Adaptive Neural Networks for Energy-Constrained Image Classification , 2018, 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[79] Jose-Maria Arnau,et al. UNFOLD: A Memory-Efficient Speech Recognizer Using On-The-Fly WFST Composition , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[80] William J. Dally,et al. SCNN: An accelerator for compressed-sparse convolutional neural networks , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[81] Luis Ceze,et al. Hardware-Software Co-Design: Not Just a Cliché , 2015, SNAPL.
[82] Ying Ma,et al. A Taxonomy for Neural Memory Networks , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[83] Engin Ipek,et al. Making Memristive Neural Network Accelerators Reliable , 2018, 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA).