Efficient Processing of Deep Neural Networks
[1] S. Belloni, et al. DeepBench , 2022, Proceedings of the 9th International Workshop on Testing Database Systems (DBTest).
[2] Candace Moore,et al. Deep learning frameworks , 2021, Radiopaedia.org.
[3] Vivienne Sze,et al. An Architecture-Level Energy and Area Estimator for Processing-In-Memory Accelerator Designs , 2020, 2020 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[4] Fjodor van Veen,et al. The Neural Network Zoo , 2020, Proceedings.
[5] Dirk Englund,et al. Digital Optical Neural Networks for Large-Scale Machine Learning , 2020, 2020 Conference on Lasers and Electro-Optics (CLEO).
[6] Jose Javier Gonzalez Ortiz,et al. What is the State of Neural Network Pruning? , 2020, MLSys.
[7] Michael Carbin,et al. Comparing Rewinding and Fine-tuning in Neural Network Pruning , 2020, ICLR.
[8] Luca P. Carloni,et al. Silicon Photonics Codesign for Deep Learning , 2020, Proceedings of the IEEE.
[10] Jonathan Chang,et al. 15.3 A 351TOPS/W and 372.4GOPS Compute-in-Memory SRAM Macro in 7nm FinFET CMOS for Machine-Learning Applications , 2020, 2020 IEEE International Solid- State Circuits Conference - (ISSCC).
[11] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[12] Vivienne Sze,et al. Design Considerations for Efficient Deep Neural Networks on Processing-in-Memory Accelerators , 2019, 2019 IEEE International Electron Devices Meeting (IEDM).
[13] Gu-Yeon Wei,et al. A binary-activation, multi-level weight RNN and training algorithm for processing-in-memory inference with eNVM , 2019, ArXiv.
[14] Vivienne Sze,et al. Accelergy: An Architecture-Level Energy Estimation Methodology for Accelerator Designs , 2019, 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD).
[15] B. Murmann,et al. RRAM-Based In-Memory Computing for Embedded Deep Neural Networks , 2019, 2019 53rd Asilomar Conference on Signals, Systems, and Computers.
[16] Christian Enz,et al. Review and Benchmarking of Precision-Scalable Multiply-Accumulate Unit Architectures for Embedded Neural-Network Processing , 2019, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[17] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[18] Aamer Jaleel,et al. ExTensor: An Accelerator for Sparse Tensor Algebra , 2019, MICRO.
[19] David Wentzlaff,et al. ComputeDRAM: In-Memory Compute Using Off-the-Shelf DRAMs , 2019, MICRO.
[20] T. N. Vijaykumar,et al. SparTen: A Sparse Tensor Accelerator for Convolutional Neural Networks , 2019, MICRO.
[21] Wei Tang,et al. CASCADE: Connecting RRAMs to Extend Analog Dataflow In An End-To-End In-Memory Processing Paradigm , 2019, MICRO.
[22] William J. Dally,et al. Simba: Scaling Deep-Learning Inference with Multi-Chip-Module-Based Architecture , 2019, MICRO.
[23] Hongyang Jia,et al. In-Memory Computing: Advances and prospects , 2019, IEEE Solid-State Circuits Magazine.
[24] Wooseok Yi,et al. BitBlade: Area and Energy-Efficient Precision-Scalable Neural Network Accelerator with Bitwise Summation , 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).
[25] P. Bai,et al. Non-Volatile RRAM Embedded into 22FFL FinFET Technology , 2019, 2019 Symposium on VLSI Technology.
[26] William J. Dally,et al. A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator with Ground-Reference Signaling in 16nm , 2019, 2019 Symposium on VLSI Circuits.
[27] Pradeep Dubey,et al. A Study of BFLOAT16 for Deep Learning Training , 2019, ArXiv.
[28] Quoc V. Le,et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.
[29] Quoc V. Le,et al. Searching for MobileNetV3 , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).
[30] Jason Clemons,et al. Buffets: An Efficient and Composable Storage Idiom for Explicit Decoupled Data Orchestration , 2019, ASPLOS.
[31] Christoforos E. Kozyrakis,et al. TANGRAM: Optimized Coarse-Grained Dataflow for Scalable NN Accelerators , 2019, ASPLOS.
[32] Patrick Judd,et al. Bit-Tactical: A Software/Hardware Approach to Exploiting Value and Bit Sparsity in Neural Networks , 2019, ASPLOS.
[33] Sertac Karaman,et al. FastDepth: Fast Monocular Depth Estimation on Embedded Systems , 2019, 2019 International Conference on Robotics and Automation (ICRA).
[34] Rudy Lauwereins,et al. Sub-Word Parallel Precision-Scalable MAC Engines for Efficient Embedded DNN Inference , 2019, 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS).
[35] Brucek Khailany,et al. Timeloop: A Systematic Approach to DNN Accelerator Evaluation , 2019, 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).
[36] Erich Elsen,et al. The State of Sparsity in Deep Neural Networks , 2019, ArXiv.
[37] George Papandreou,et al. DeeperLab: Single-Shot Image Parser , 2019, ArXiv.
[38] Arash AziziMazreah,et al. Shortcut Mining: Exploiting Cross-Layer Shortcut Reuse in DCNN Accelerators , 2019, 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[39] Reetuparna Das,et al. Bit Prudent In-Cache Acceleration of Deep Convolutional Neural Networks , 2019, 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[40] Lee-Sup Kim,et al. NAND-Net: Minimizing Computational Complexity of In-Memory Processing for Binary Neural Networks , 2019, 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[41] Li Fei-Fei,et al. Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[42] Tayfun Gokmen,et al. The Next Generation of Deep Learning Hardware: Analog Computing , 2019, Proceedings of the IEEE.
[43] Tadahiro Kuroda,et al. QUEST: Multi-Purpose Log-Quantized DNN Inference Engine Stacked on 96-MB 3-D SRAM Using Inductive Coupling Technology in 40-nm CMOS , 2019, IEEE Journal of Solid-State Circuits.
[44] Marian Verhelst, et al. An Always-On 3.8 µJ/86% CIFAR-10 Mixed-Signal Binary CNN Processor With All Memory on Chip in 28-nm CMOS , 2019, IEEE Journal of Solid-State Circuits.
[45] Niraj K. Jha,et al. ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[46] Yuandong Tian,et al. FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[47] T. Ghani,et al. MRAM as Embedded Non-Volatile Memory Solution for 22FFL FinFET Technology , 2018, 2018 IEEE International Electron Devices Meeting (IEDM).
[48] Ryan Hamerly,et al. Large-Scale Optical Neural Networks based on Photoelectric Multiplication , 2018, Physical Review X.
[49] N. Verma,et al. A Microprocessor implemented in 65nm CMOS with Configurable and Bit-scalable Accelerator for Programmable In-memory Computing , 2018, ArXiv.
[50] H. T. Kung,et al. Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization , 2018, ASPLOS.
[51] Mostafa Mahmoud,et al. Diffy: a Déjà vu-Free Differential Deep Neural Network Accelerator , 2018, 2018 51st Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[52] Trevor Darrell,et al. Rethinking the Value of Network Pruning , 2018, ICLR.
[53] Song Han,et al. ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware , 2018, ICLR.
[54] George Papandreou,et al. Searching for Efficient Multi-Scale Architectures for Dense Image Prediction , 2018, NeurIPS.
[55] Hoi-Jun Yoo,et al. DNPU: An Energy-Efficient Deep-Learning Processor with Heterogeneous Multi-Core Architecture , 2018, IEEE Micro.
[56] Marian Verhelst,et al. Laika: A 5uW Programmable LSTM Accelerator for Always-on Keyword Spotting in 65nm CMOS , 2018, ESSCIRC 2018 - IEEE 44th European Solid State Circuits Conference (ESSCIRC).
[57] Bo Chen,et al. MnasNet: Platform-Aware Neural Architecture Search for Mobile , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[58] Aaron Klein,et al. Towards Automated Deep Learning: Efficient Joint Neural Architecture and Hyperparameter Search , 2018, ArXiv.
[59] Vivienne Sze,et al. Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices , 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[60] Quoc V. Le,et al. Understanding and Simplifying One-Shot Architecture Search , 2018, ICML.
[61] Yiming Yang,et al. DARTS: Differentiable Architecture Search , 2018, ICLR.
[62] Hossein Valavi,et al. A Mixed-Signal Binarized Convolutional-Neural-Network Accelerator Integrating Dense Weight Storage and Multiplication for Reduced Data Movement , 2018, 2018 IEEE Symposium on VLSI Circuits.
[63] Eric S. Chung,et al. A Configurable Cloud-Scale DNN Processor for Real-Time AI , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[64] Leibo Liu,et al. An Ultra-High Energy-Efficient Reconfigurable Processor for Deep Neural Networks with Binary/Ternary Weights in 28NM CMOS , 2018, 2018 IEEE Symposium on VLSI Circuits.
[65] Rajesh K. Gupta,et al. SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[66] Tao Li,et al. Prediction Based Execution on Deep Neural Networks , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[67] Jose-Maria Arnau,et al. Computation Reuse in DNNs by Exploiting Input Similarity , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[68] J. Hennessy. A new golden age for computer architecture: Domain-specific hardware/software co-design, enhanced security, open instruction sets, and agile chip development , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[69] Aleksander Madry,et al. How Does Batch Normalization Help Optimization? (No, It Is Not About Internal Covariate Shift) , 2018, NeurIPS.
[70] Nam Sung Kim,et al. GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[71] David Blaauw,et al. Neural Cache: Bit-Serial In-Cache Acceleration of Deep Neural Networks , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[72] Hyoukjun Kwon,et al. MAESTRO: An Open-source Infrastructure for Modeling Dataflows within Deep Learning Accelerators , 2018, ArXiv.
[73] Shoaib Kamil,et al. Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code , 2018, 2019 IEEE/ACM International Symposium on Code Generation and Optimization (CGO).
[74] Saman P. Amarasinghe,et al. Format abstraction for sparse tensor algebra compilers , 2018, Proc. ACM Program. Lang..
[75] Sujan Kumar Gonugondla,et al. An In-Memory VLSI Architecture for Convolutional Neural Networks , 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[76] Hari Angepat,et al. Serving DNNs in Real Time at Datacenter Scale with Project Brainwave , 2018, IEEE Micro.
[77] Mengjia Yan,et al. UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[78] Yi Luo,et al. All-optical machine learning using diffractive deep neural networks , 2018, Science.
[79] Bo Chen,et al. NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications , 2018, ECCV.
[80] Matthew Mattina,et al. Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[81] Hyoukjun Kwon,et al. MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Reconfigurable Interconnects , 2018, ASPLOS.
[82] Suren Jayasuriya,et al. EVA²: Exploiting Temporal Redundancy in Live Computer Vision , 2018, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[83] Michael Carbin,et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks , 2018, ICLR.
[84] Alok Aggarwal,et al. Regularized Evolution for Image Classifier Architecture Search , 2018, AAAI.
[85] Anantha Chandrakasan,et al. Conv-RAM: An energy-efficient SRAM with embedded convolution computation for low-power CNN-based machine learning applications , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).
[86] Meng-Fan Chang,et al. A 65nm 1Mb nonvolatile computing-in-memory ReRAM macro with sub-16ns multiply-and-accumulate for binary DNN AI edge processors , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).
[87] Hoi-Jun Yoo,et al. UNPU: A 50.6TOPS/W unified deep neural network accelerator with 1b-to-16b fully-variable weight bit-precision , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).
[88] Marian Verhelst,et al. An always-on 3.8μJ/86% CIFAR-10 mixed-signal binary CNN processor with all memory on chip in 28nm CMOS , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).
[89] Shimeng Yu,et al. Neuro-Inspired Computing With Emerging Nonvolatile Memorys , 2018, Proceedings of the IEEE.
[90] Sujan Kumar Gonugondla,et al. A Multi-Functional In-Memory Inference Processor Using a Standard 6T SRAM Array , 2018, IEEE Journal of Solid-State Circuits.
[91] Jonathan Ragan-Kelley, et al. Halide , 2017.
[92] Hadi Esmaeilzadeh,et al. Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Network , 2017, 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA).
[93] F. Merrikh Bayat,et al. Fast, energy-efficient, robust, and reproducible mixed-signal neuromorphic classifier based on embedded NOR flash memory technology , 2017, 2017 IEEE International Electron Devices Meeting (IEDM).
[94] Elad Eban,et al. MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[95] Asit K. Mishra,et al. Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy , 2017, ICLR.
[96] Benoît Meister,et al. Polyhedral Optimization of TensorFlow Computation Graphs , 2017, ESPT/VPA@SC.
[97] Xin Wang,et al. Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks , 2017, NIPS.
[98] Oriol Vinyals,et al. Hierarchical Representations for Efficient Architecture Search , 2017, ICLR.
[99] Diana Marculescu,et al. NeuralPower: Predict and Deploy Energy-Efficient Convolutional Neural Networks , 2017, ArXiv.
[100] Yuan Xie,et al. DRISA: A DRAM-based Reconfigurable In-Situ Accelerator , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[101] Onur Mutlu,et al. Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology , 2017, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[102] Shoaib Kamil,et al. The tensor algebra compiler , 2017, Proc. ACM Program. Lang..
[103] Joel Emer,et al. A method to estimate the energy consumption of deep neural networks , 2017, 2017 51st Asilomar Conference on Signals, Systems, and Computers.
[104] Gang Sun,et al. Squeeze-and-Excitation Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[105] Eriko Nurvitadhi,et al. WRPN: Wide Reduced-Precision Networks , 2017, ICLR.
[106] Li Shen,et al. Deep Learning to Improve Breast Cancer Detection on Screening Mammography , 2017, Scientific Reports.
[107] Wei Wu,et al. Practical Block-Wise Neural Network Architecture Generation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[108] Vijay Vasudevan,et al. Learning Transferable Architectures for Scalable Image Recognition , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[109] Trevor Darrell,et al. Deep Layer Aggregation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[110] Jianxin Wu,et al. ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[111] Xiangyu Zhang,et al. Channel Pruning for Accelerating Very Deep Neural Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[112] Chen Sun,et al. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[113] Xiangyu Zhang,et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[114] Yong Yu,et al. Efficient Architecture Search by Network Transformation , 2017, AAAI.
[115] Scott A. Mahlke,et al. Scalpel: Customizing DNN pruning to the underlying hardware parallelism , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[118] Patrick Judd,et al. Loom: Exploiting Weight and Activation Precisions to Accelerate Convolutional Neural Networks , 2017, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).
[119] Vivienne Sze,et al. Using Dataflow to Optimize Energy Efficiency of Deep Neural Network Accelerators , 2017, IEEE Micro.
[120] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[121] Tadahiro Kuroda,et al. BRein memory: A 13-layer 4.2 K neuron/0.8 M synapse binary/ternary reconfigurable in-memory deep neural network accelerator in 65 nm CMOS , 2017, 2017 Symposium on VLSI Circuits.
[122] Leibo Liu,et al. A 1.06-to-5.09 TOPS/W reconfigurable hybrid-neural-network processor for deep learning applications , 2017, 2017 Symposium on VLSI Circuits.
[123] Meng-Fan Chang,et al. A 462GOPs/J RRAM-based nonvolatile intelligent processor for energy harvesting IoE system featuring nonvolatile logics and processing-in-memory , 2017, 2017 Symposium on VLSI Technology.
[124] William J. Dally,et al. SCNN: An accelerator for compressed-sparse convolutional neural networks , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[125] Charbel Sakr,et al. PredictiveNet: An energy-efficient convolutional neural network via zero prediction , 2017, 2017 IEEE International Symposium on Circuits and Systems (ISCAS).
[126] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[127] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[128] Leibo Liu,et al. Deep Convolutional Neural Network Architecture With Reconfigurable Computation Patterns , 2017, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[129] Trevor N. Mudge,et al. Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge , 2017, ASPLOS.
[130] Christoforos E. Kozyrakis,et al. TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory , 2017, ASPLOS.
[131] Alexei A. Efros,et al. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[132] Vivienne Sze,et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey , 2017, Proceedings of the IEEE.
[133] Vivienne Sze,et al. Towards closing the energy gap between HOG and CNN features for embedded vision (Invited paper) , 2017, 2017 IEEE International Symposium on Circuits and Systems (ISCAS).
[134] Daisuke Miyashita,et al. LogNet: Energy-efficient neural networks using logarithmic computation , 2017, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[135] Quoc V. Le,et al. Large-Scale Evolution of Image Classifiers , 2017, ICML.
[136] Rahul Sukthankar,et al. Cognitive Mapping and Planning for Visual Navigation , 2017, International Journal of Computer Vision.
[137] Jian Sun,et al. Deep Learning with Low Precision by Half-Wave Gaussian Quantization , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[138] Xiaowei Li,et al. FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[139] Yiran Chen,et al. PipeLayer: A Pipelined ReRAM-Based Accelerator for Deep Learning , 2017, 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA).
[140] Marian Verhelst,et al. 14.5 Envision: A 0.26-to-10TOPS/W subword-parallel dynamic-voltage-accuracy-frequency-scalable Convolutional Neural Network processor in 28nm FDSOI , 2017, 2017 IEEE International Solid-State Circuits Conference (ISSCC).
[141] Sebastian Thrun,et al. Dermatologist-level classification of skin cancer with deep neural networks , 2017, Nature.
[142] Tomas Pfister,et al. Learning from Simulated and Unsupervised Images through Adversarial Training , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[143] Aaron Klein,et al. Towards Automatically-Tuned Neural Networks , 2016, AutoML@ICML.
[144] Philip Heng Wai Leong,et al. FINN: A Framework for Fast, Scalable Binarized Neural Network Inference , 2016, FPGA.
[145] Vivienne Sze,et al. Designing Energy-Efficient Convolutional Neural Networks Using Energy-Aware Pruning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[146] Zhuowen Tu,et al. Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[147] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[148] An Chen. A review of emerging non-volatile memory (NVM) technologies and applications , 2016, Solid-State Electronics.
[149] Andreas Moshovos,et al. Bit-Pragmatic Deep Neural Network Computing , 2016, 2017 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[150] Vincent Dumoulin, et al. Deconvolution and Checkerboard Artifacts , 2016, Distill.
[151] Shaoli Liu,et al. Cambricon-X: An accelerator for sparse neural networks , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[152] Manoj Alwani,et al. Fused-layer CNN accelerators , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[153] Amnon Shashua,et al. Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving , 2016, ArXiv.
[154] Roland Siegwart,et al. From perception to decision: A data-driven approach to end-to-end motion planning for autonomous ground robots , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[155] Ran El-Yaniv,et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations , 2016, J. Mach. Learn. Res..
[156] Christian Ledig,et al. Is the deconvolution layer the same as a convolutional layer? , 2016, ArXiv.
[157] Heiga Zen,et al. WaveNet: A Generative Model for Raw Audio , 2016, SSW.
[158] Ying Zhang,et al. Towards End-to-End Speech Recognition with Deep Convolutional Neural Networks , 2016, INTERSPEECH.
[159] Kevin Petrecca,et al. Neural networks improve brain cancer detection with Raman spectroscopy in the presence of operating room light artifacts , 2016, Journal of biomedical optics.
[160] Hanan Samet,et al. Pruning Filters for Efficient ConvNets , 2016, ICLR.
[161] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[162] Gu-Yeon Wei,et al. Fathom: reference workloads for modern deep learning methods , 2016, 2016 IEEE International Symposium on Workload Characterization (IISWC).
[163] Yurong Chen,et al. Dynamic Network Surgery for Efficient DNNs , 2016, NIPS.
[164] Yiran Chen,et al. Learning Structured Sparsity in Deep Neural Networks , 2016, NIPS.
[165] Xiaoou Tang,et al. Accelerating the Super-Resolution Convolutional Neural Network , 2016, ECCV.
[166] Shuicheng Yan,et al. Training Skinny Deep Neural Networks with Iterative Hard Thresholding Methods , 2016, ArXiv.
[167] Daniel Rueckert,et al. Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[168] Shuchang Zhou,et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients , 2016, ArXiv.
[169] Sudhakar Yalamanchili,et al. Neurocube: A Programmable Digital Neuromorphic Architecture with High-Density 3D Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[170] Dayong Wang,et al. Deep Learning for Identifying Metastatic Breast Cancer , 2016, ArXiv.
[171] Yu Wang,et al. PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[172] Lin Zhong,et al. RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[173] Natalie D. Enright Jerger,et al. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[174] Luca Benini,et al. YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights , 2016, 2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI).
[175] Naveen Verma,et al. A machine-learning classifier implemented in a standard 6T SRAM array , 2016, 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits).
[176] Marian Verhelst,et al. A 0.3–2.6 TOPS/W precision-scalable processor for real-time large-scale ConvNets , 2016, 2016 IEEE Symposium on VLSI Circuits (VLSI-Circuits).
[177] Iasonas Kokkinos,et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[178] Miao Hu,et al. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[179] Nassir Navab,et al. Deeper Depth Prediction with Fully Convolutional Residual Networks , 2016, 2016 Fourth International Conference on 3D Vision (3DV).
[180] David K. Gifford,et al. Convolutional neural network architectures for predicting DNA–protein binding , 2016, Bioinform..
[181] Vivienne Sze,et al. Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[182] Nikos Komodakis,et al. Wide Residual Networks , 2016, BMVC.
[183] Ashok Veeraraghavan,et al. ASP Vision: Optically Computing the First Layer of Convolutional Neural Networks Using Angle Sensitive Pixels , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[184] Vivienne Sze,et al. FAST: A Framework to Accelerate Super-Resolution Processing on Compressed Videos , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
[185] Andrew S. Cassidy,et al. Convolutional networks for fast, energy-efficient neuromorphic computing , 2016, Proceedings of the National Academy of Sciences.
[186] Li Fei-Fei,et al. Perceptual Losses for Real-Time Style Transfer and Super-Resolution , 2016, ECCV.
[187] Gökmen Tayfun,et al. Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices: Design Considerations , 2016, Front. Neurosci..
[188] Matthew Richardson,et al. Do Deep Convolutional Nets Really Need to be Deep and Convolutional? , 2016, ICLR.
[189] Ali Farhadi,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.
[190] Gert Cauwenberghs,et al. Neuromorphic architectures with electronic synapses , 2016, 2016 17th International Symposium on Quality Electronic Design (ISQED).
[191] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[192] Daisuke Miyashita,et al. Convolutional Neural Networks using Logarithmic Data Representation , 2016, ArXiv.
[193] Joel Emer, et al. Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks , 2016, 2016 IEEE International Solid-State Circuits Conference (ISSCC).
[194] Forrest N. Iandola,et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.
[195] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[196] Soheil Ghiasi,et al. Hardware-oriented Approximation of Convolutional Neural Networks , 2016, ArXiv.
[197] Yoshua Bengio,et al. BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 , 2016, ArXiv.
[198] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[199] V. Sze,et al. Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks , 2016, IEEE Journal of Solid-State Circuits.
[200] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[201] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[202] Margaret Martonosi,et al. DeSC: Decoupled supply-compute communication management for heterogeneous architectures , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[203] Sergey Ioffe,et al. Rethinking the Inception Architecture for Computer Vision , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[204] Sepp Hochreiter,et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.
[205] Vladlen Koltun,et al. Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.
[206] Eunhyeok Park,et al. Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications , 2015, ICLR.
[207] George Karypis,et al. Tensor-matrix products with a compressed sparse tensor , 2015, IA3@SC.
[208] Yoshua Bengio,et al. BinaryConnect: Training Deep Neural Networks with binary weights during propagations , 2015, NIPS.
[209] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[210] Andrew Lavin,et al. Fast Algorithms for Convolutional Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[211] Sergey Levine,et al. Learning deep control policies for autonomous aerial vehicles with MPC-guided policy search , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).
[212] O. Troyanskaya,et al. Predicting effects of noncoding variants with deep learning–based sequence model , 2015, Nature Methods.
[213] B. Frey,et al. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning , 2015, Nature Biotechnology.
[214] Tianshi Chen,et al. ShiDianNao: Shifting vision processing closer to the sensor , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[215] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.
[216] Luca P. Carloni,et al. An analysis of accelerator coupling in heterogeneous architectures , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[217] Geoffrey E. Hinton,et al. Deep Learning , 2015, Nature.
[218] Luca Benini,et al. Origami: A Convolutional Network Accelerator , 2015, ACM Great Lakes Symposium on VLSI.
[219] Seunghoon Hong,et al. Learning Deconvolution Network for Semantic Segmentation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[220] Jianxiong Xiao,et al. DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[221] Yixin Chen,et al. Compressing Neural Networks with the Hashing Trick , 2015, ICML.
[222] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[223] Hoi-Jun Yoo,et al. 4.6 A 1.93TOPS/W scalable deep learning/inference processor with tetra-parallel MIMD architecture for big-data applications , 2015, 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers.
[224] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[225] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[226] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[227] Pritish Narayanan,et al. Deep Learning with Limited Numerical Precision , 2015, ICML.
[228] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[229] Naveen Verma,et al. 18.4 A matrix-multiplying ADC implementing a machine-learning classifier directly with data conversion , 2015, 2015 IEEE International Solid-State Circuits Conference - (ISSCC) Digest of Technical Papers.
[230] B. Frey,et al. The human splicing code reveals new insights into the genetic determinants of disease , 2015, Science.
[231] Xiaoou Tang,et al. Image Super-Resolution Using Deep Convolutional Networks , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[232] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[233] Ivan V. Oseledets,et al. Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition , 2014, ICLR.
[234] Samira Ebrahimi Kahou,et al. FitNets: Hints for Thin Deep Nets , 2014, ICLR.
[235] Benjamin Graham,et al. Fractional Max-Pooling , 2014, ArXiv.
[236] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[237] Farnood Merrikh-Bayat,et al. Training and operation of an integrated neuromorphic network based on metal-oxide memristors , 2014, Nature.
[238] Thomas Brox,et al. Learning to Generate Chairs, Tables and Cars with Convolutional Networks , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[239] Trevor Darrell,et al. Fully convolutional networks for semantic segmentation , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[240] Paul R. Prucnal,et al. Broadcast and Weight: An Integrated Network For Scalable Photonic Spike Processing , 2014, Journal of Lightwave Technology.
[241] John Tran,et al. cuDNN: Efficient Primitives for Deep Learning , 2014, ArXiv.
[242] Vivienne Sze,et al. Energy-efficient HOG-based object detection at 1080HD 60 fps with multi-scale support , 2014, 2014 IEEE Workshop on Signal Processing Systems (SiPS).
[243] Dumitru Erhan,et al. Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[244] Jason Cong,et al. Minimizing Computation in Convolutional Neural Networks , 2014, ICANN.
[245] Xiaoou Tang,et al. Learning a Deep Convolutional Network for Image Super-Resolution , 2014, ECCV.
[246] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[247] Michael S. Bernstein,et al. ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.
[248] Andrew S. Cassidy,et al. A million spiking-neuron integrated circuit with a scalable communication network and interface , 2014, Science.
[249] Berin Martini,et al. A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[250] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[251] Aaron C. Courville,et al. Generative Adversarial Networks , 2014, ArXiv:1406.2661.
[252] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[253] Jason Cong,et al. Accelerator-rich architectures: Opportunities and progresses , 2014, 2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC).
[254] Xiaohui Zhang,et al. Improving deep neural network acoustic models using generalized maxout networks , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[255] Joan Bruna,et al. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation , 2014, NIPS.
[256] Mark Horowitz,et al. 1.1 Computing's energy problem (and what we can do about it) , 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).
[257] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[258] Xiang Zhang,et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.
[259] Rich Caruana,et al. Do Deep Nets Really Need to be Deep? , 2013, NIPS.
[260] Yann LeCun,et al. Fast Training of Convolutional Networks through FFTs , 2013, ICLR.
[261] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[262] Qiang Chen,et al. Network In Network , 2013, ICLR.
[263] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[264] Henk Corporaal,et al. Memory-centric accelerator design for Convolutional Neural Networks , 2013, 2013 IEEE 31st International Conference on Computer Design (ICCD).
[265] Yann LeCun,et al. Regularization of Neural Networks using DropConnect , 2013, ICML.
[266] Frédo Durand,et al. Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines , 2013, PLDI.
[267] Geoffrey Zweig,et al. Recent advances in deep learning for speech research at Microsoft , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[268] Tara N. Sainath,et al. Deep convolutional neural networks for LVCSR , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[269] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[270] Albert Wang,et al. A 180nm CMOS image sensor with on-chip optoelectronic image compression , 2012, Proceedings of the IEEE 2012 Custom Integrated Circuits Conference.
[271] François Fleuret,et al. Exact Acceleration of Linear Object Detectors , 2012, ECCV.
[272] J. Jeddeloh,et al. Hybrid memory cube new DRAM architecture increases density and performance , 2012, 2012 Symposium on VLSI Technology (VLSIT).
[273] Christoforos E. Kozyrakis,et al. Towards energy-proportional datacenter memory with mobile DRAM , 2012, 2012 39th Annual International Symposium on Computer Architecture (ISCA).
[274] B. Ramkumar,et al. Low-Power and Area-Efficient Carry Select Adder , 2012, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[275] Graham W. Taylor,et al. Adaptive deconvolutional networks for mid and high level feature learning , 2011, 2011 International Conference on Computer Vision.
[276] Honglak Lee,et al. Unsupervised learning of hierarchical representations with convolutional deep belief networks , 2011, Commun. ACM.
[277] Bill Dally,et al. Power, Programmability, and Granularity: The Challenges of ExaScale Computing , 2011, 2011 IEEE International Parallel & Distributed Processing Symposium.
[278] Heng-Yuan Lee,et al. A 4Mb embedded SLC resistive-RAM macro with 7.2ns read-write random-access time and 160ns MLC-access capability , 2011, 2011 IEEE International Solid-State Circuits Conference.
[279] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..
[280] Wayne Luk,et al. Towards an embedded biologically-inspired machine vision processor , 2010, 2010 International Conference on Field-Programmable Technology.
[281] Jiale Liang,et al. Cross-Point Memory Array Without Cell Selectors—Device Characteristics and Data Storage Pattern Dependencies , 2010, IEEE Transactions on Electron Devices.
[282] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[283] Aamer Jaleel,et al. High performance cache replacement using re-reference interval prediction (RRIP) , 2010, ISCA.
[284] Srihari Cadambi,et al. A dynamically configurable coprocessor for convolutional neural networks , 2010, ISCA.
[285] Roberto Bez,et al. A 90nm 4Mb embedded phase-change memory with 1.2V 12ns read access time and 1MB/s write throughput , 2010, 2010 IEEE International Solid-State Circuits Conference - (ISSCC).
[286] William J. Dally,et al. The GPU Computing Era , 2010, IEEE Micro.
[287] Srihari Cadambi,et al. A Massively Parallel Coprocessor for Convolutional Neural Networks , 2009, 2009 20th IEEE International Conference on Application-specific Systems, Architectures and Processors.
[288] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[289] Antonio Torralba,et al. 80 Million Tiny Images: A Large Dataset for Non-parametric Object and Scene Recognition , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[290] John R. Gilbert,et al. On the representation and multiplication of hypersparse matrices , 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing.
[291] Rich Caruana,et al. Model compression , 2006, KDD '06.
[292] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[293] Bernard Carlos Widrow,et al. Thinking about thinking: the discovery of the LMS algorithm , 2005, IEEE Signal Process. Mag..
[294] Hiroyuki Iida,et al. On the cleanliness requirements of the International Technology Roadmap for Semiconductors 2003: current status of cleanliness requirements and analysis methods for silicon wafer surfaces and ambient environments , 2004 .
[295] Norbert Wehn,et al. Embedded DRAM Development: Technology, Physical Design, and Application Issues , 2001, IEEE Des. Test Comput..
[296] Amy Hsiu-Fen Chou,et al. Flash Memories , 2000, The VLSI Handbook.
[297] S. Hochreiter,et al. Long Short-Term Memory , 1997, Neural Computation.
[298] Volker Tresp,et al. Early Brain Damage , 1996, NIPS.
[299] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[300] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[301] Russell Reed,et al. Pruning algorithms-a survey , 1993, IEEE Trans. Neural Networks.
[302] Joan L. Mitchell,et al. JPEG: Still Image Data Compression Standard , 1992 .
[303] D. Williamson. Dynamically scaled fixed point arithmetic , 1991, [1991] IEEE Pacific Rim Conference on Communications, Computers and Signal Processing Conference Proceedings.
[304] Monica S. Lam,et al. The cache performance and optimizations of blocked algorithms , 1991, ASPLOS IV.
[305] David H. Bailey,et al. Using Strassen's algorithm to accelerate the solution of linear systems , 1991, The Journal of Supercomputing.
[306] Ehud D. Karnin,et al. A simple procedure for pruning back-propagation trained neural networks , 1990, IEEE Trans. Neural Networks.
[307] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[308] I. Guyon,et al. Handwritten digit recognition: applications of neural network chips and automatic learning , 1989, IEEE Communications Magazine.
[309] Janowsky,et al. Pruning versus clipping in neural networks. , 1989, Physical review. A, General physics.
[310] N. Takagi,et al. A high-speed multiplier using a redundant binary adder tree , 1987 .
[311] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[312] James E. Smith. Decoupled access/execute computer architectures , 1982, ISCA '98.
[313] Nicolas Halbwachs,et al. Automatic discovery of linear restraints among variables of a program , 1978, POPL.
[314] Leslie Lamport,et al. The parallel execution of DO loops , 1974, CACM.
[315] L. Chua. Memristor-The missing circuit element , 1971 .
[316] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[317] Richard M. Karp,et al. The Organization of Computations for Uniform Recurrence Equations , 1967, JACM.
[318] William F. Tinney,et al. Techniques for Exploiting the Sparsity of the Network Admittance Matrix , 1963 .
[319] J. Little. A Proof for the Queuing Formula: L = λW , 1961 .
[320] Mary Wootters,et al. The N3XT Approach to Energy-Efficient Abundant-Data Computing , 2019, Proceedings of the IEEE.
[321] Wafer-Scale Deep Learning , 2019, 2019 IEEE Hot Chips 31 Symposium (HCS).
[322] A. Parashar,et al. Stitch-X: An Accelerator Architecture for Exploiting Unstructured Sparsity in Deep Neural Networks , 2018 .
[323] Quoc V. Le,et al. Searching for Activation Functions , 2018, ArXiv.
[324] Haichen Shen,et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning , 2018 .
[325] J. Emer,et al. Understanding the Limitations of Existing Energy-Efficient Design Approaches for Deep Neural Networks , 2018 .
[326] Dirk Englund,et al. Deep learning with coherent nanophotonic circuits , 2017, 2017 Fifth Berkeley Symposium on Energy Efficient Electronic Systems & Steep Transistors Workshop (E3S).
[327] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[328] Tsu-Jae King Liu,et al. There's plenty of room at the top , 2017, 2017 IEEE 30th International Conference on Micro Electro Mechanical Systems (MEMS).
[329] Vivienne Sze,et al. Hardware for machine learning: Challenges and opportunities , 2017, 2017 IEEE Custom Integrated Circuits Conference (CICC).
[330] Patrick Judd,et al. Stripes: Bit-serial deep neural network computing , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[331] S. Simon Wong,et al. 24.2 A 2.5GHz 7.7TOPS/W switched-capacitor matrix multiplier with co-designed local memory in 40nm , 2016, 2016 IEEE International Solid-State Circuits Conference (ISSCC).
[332] Jan M. Rabaey,et al. Digital Integrated Circuits: A Design Perspective , 2016 .
[333] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[334] D. Ditzel,et al. Low-cost 3D chip stacking with ThruChip wireless connections , 2014, 2014 IEEE Hot Chips 26 Symposium (HCS).
[335] Endong Wang,et al. Intel Math Kernel Library , 2014 .
[336] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013 .
[337] ITU-T and ISO/IEC JTC 1. Advanced Video Coding for Generic Audiovisual Services , 2010 .
[338] A. Krizhevsky. Convolutional Deep Belief Networks on CIFAR-10 , 2010 .
[339] Samuel Williams,et al. Roofline: An Insightful Visual Performance Model for Floating-Point Programs and Multicore Architectures , 2008 .
[340] C. E. Shannon. A Mathematical Theory of Communication , 1948, Bell System Technical Journal.
[341] Yann LeCun,et al. The MNIST database of handwritten digits , 2005 .
[342] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[343] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[344] Jae S. Lim. Two-Dimensional Signal and Image Processing , 1989 .
[345] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.
[346] Michael C. Mozer,et al. Using Relevance to Reduce Network Size Automatically , 1989 .
[347] Bernard Widrow,et al. Adaptive switching circuits , 1988 .
[348] S. Winograd. Arithmetic complexity of computations , 1980 .
[349] J. H. Wilkinson. Rounding Errors in Algebraic Processes , 1963 .