Luca Benini | Andrea Bonetti | Erfan Azarkhish | Petar Jokic | Marc Pons | Stephane Emery
[1] Yandong Luo,et al. Robust Processing-In-Memory With Multibit ReRAM Using Hessian-Driven Mixed-Precision Computation , 2021, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[2] Michael W. Mahoney,et al. A Survey of Quantization Methods for Efficient Neural Network Inference , 2021, Low-Power Computer Vision.
[3] L. Benini,et al. CUTIE: Beyond PetaOp/s/W Ternary DNN Inference Acceleration With Better-Than-Binary Energy Efficiency , 2020, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[4] Doe Hyun Yoon,et al. The Design Process for Google's Training Chips: TPUv2 and TPUv3 , 2021, IEEE Micro.
[5] Marian Verhelst,et al. High-Utilization, High-Flexibility Depth-First CNN Coprocessor for Image Pixel Processing on FPGA , 2021, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[6] H. Lv,et al. 24.2 A 14nm-FinFET 1Mb Embedded 1T1R RRAM with a 0.022µm² Cell Size Using Self-Adaptive Delayed Termination and Multi-Cell Reference , 2021, 2021 IEEE International Solid- State Circuits Conference (ISSCC).
[7] Chung-Chuan Lo,et al. 16.3 A 28nm 384kb 6T-SRAM Computation-in-Memory Macro with 8b Precision for AI Edge Chips , 2021, 2021 IEEE International Solid- State Circuits Conference (ISSCC).
[8] Sung Kyu Lim,et al. Heterogeneous Mixed-Signal Monolithic 3-D In-Memory Computing Using Resistive RAM , 2021, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[9] Dan Alistarh,et al. Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks , 2021, J. Mach. Learn. Res..
[10] Boris Murmann,et al. Mixed-Signal Computing for Deep Neural Network Inference , 2021, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[11] Improving Memory Utilization in Convolutional Neural Network Accelerators , 2020, IEEE Embedded Systems Letters.
[12] L. Sousa. Nonconventional Computer Arithmetic Circuits, Systems and Applications , 2021, IEEE Circuits and Systems Magazine.
[13] Muhammad Shafique,et al. Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead , 2020, IEEE Access.
[14] Hoi-Jun Yoo,et al. The Development of Silicon for AI: Different Design Approaches , 2020, IEEE Transactions on Circuits and Systems I: Regular Papers.
[15] D. Blaauw,et al. A µProcessor Layer for mm-Scale Die-Stacked Sensing Platforms Featuring Ultra-Low Power Sleep Mode at 125°C , 2020, 2020 IEEE Asian Solid-State Circuits Conference (A-SSCC).
[16] Oliver Bringmann,et al. UltraTrail: A Configurable Ultralow-Power TC-ResNet AI Accelerator for Efficient Keyword Spotting , 2020, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[17] Shouyi Yin,et al. Efficient Scheduling of Irregular Network Structures on CNN Accelerators , 2020, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[18] Luca Benini,et al. Modular Design and Optimization of Biomedical Applications for Ultralow Power Heterogeneous Platforms , 2020, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[19] C. Ilas,et al. Towards real-time and real-life image classification and detection using CNN: a review of practical applications requirements, algorithms, hardware and current trends , 2020, 2020 IEEE 26th International Symposium for Design and Technology in Electronic Packaging (SIITME).
[20] Xiaoming Li,et al. Fast Convolutional Neural Networks with Fine-Grained FFTs , 2020, PACT.
[21] Jaeyoung Park,et al. Neuromorphic Computing Using Emerging Synaptic Devices: A Retrospective Summary and an Outlook , 2020, Electronics.
[22] Jeremy Kepner,et al. Survey of Machine Learning Accelerators , 2020, 2020 IEEE High Performance Extreme Computing Conference (HPEC).
[23] Honglan Jiang,et al. Approximate Arithmetic Circuits: A Survey, Characterization, and Recent Applications , 2020, Proceedings of the IEEE.
[24] Eunhyeok Park,et al. McDRAM v2: In-Dynamic Random Access Memory Systolic Array Accelerator to Address the Large Model Problem in Deep Neural Networks on the Edge , 2020, IEEE Access.
[25] Shapeshifter Networks: Decoupling Layers from Parameters for Scalable and Effective Deep Learning , 2020, ArXiv (2006.10598).
[26] Fabien Clermidy,et al. SamurAI: A 1.7MOPS-36GOPS Adaptive Versatile IoT Node with 15,000× Peak-to-Idle Power Reduction, 207ns Wake-Up Time and 1.3TOPS/W ML Efficiency , 2020, 2020 IEEE Symposium on VLSI Circuits.
[27] Pedram Pad,et al. Efficient Neural Vision Systems Based on Convolutional Image Acquisition , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[28] Sven Beyer,et al. FeFET: A versatile CMOS compatible device with game-changing potential , 2020, 2020 IEEE International Memory Workshop (IMW).
[29] J. Crowcroft,et al. Edge Intelligence: Architectures, Challenges, and Applications , 2020 .
[30] Yuan Xie,et al. Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey , 2020, Proceedings of the IEEE.
[31] David Patterson,et al. Benchmarking TinyML Systems: Challenges and Direction , 2020, ArXiv.
[32] Xiaochen Peng,et al. Compute-in-Memory with Emerging Nonvolatile-Memories: Challenges and Prospects , 2020, 2020 IEEE Custom Integrated Circuits Conference (CICC).
[33] Yiran Chen,et al. A Survey of Accelerator Architectures for Deep Neural Networks , 2020 .
[34] Massimo Alioto,et al. Low-Energy Voice Activity Detection via Energy-Quality Scaling From Data Conversion to Machine Learning , 2020, IEEE Transactions on Circuits and Systems I: Regular Papers.
[35] Nirali R. Nanavati,et al. Efficient Hardware Implementations of Deep Neural Networks: A Survey , 2020, 2020 Fourth International Conference on Inventive Systems and Control (ICISC).
[36] Matthew Mattina,et al. Systolic Tensor Array: An Efficient Structured-Sparse GEMM Accelerator for Mobile CNN Inference , 2020, IEEE Computer Architecture Letters.
[37] Cody Coleman,et al. MLPerf Inference Benchmark , 2019, 2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA).
[38] Chuang Gan,et al. Once for All: Train One Network and Specialize it for Efficient Deployment , 2019, ICLR.
[39] Massimo Alioto,et al. Energy-Quality Scalable Memory-Frugal Feature Extraction for Always-On Deep Sub-mW Distributed Vision , 2020, IEEE Access.
[40] Pijush Kanti Dutta Pramanik,et al. Power Consumption Analysis, Measurement, Management, and Issues: A State-of-the-Art Review of Smartphone Battery and Energy Usage , 2019, IEEE Access.
[41] Daniele Paolo Scarpazza,et al. Dissecting the Graphcore IPU Architecture via Microbenchmarking , 2019, ArXiv.
[42] S. O. Park,et al. 1Gbit High Density Embedded STT-MRAM in 28nm FDSOI Technology , 2019, 2019 IEEE International Electron Devices Meeting (IEDM).
[43] Xiaochen Peng,et al. DNN+NeuroSim: An End-to-End Benchmarking Framework for Compute-in-Memory Accelerators with Versatile Device Technologies , 2019, 2019 IEEE International Electron Devices Meeting (IEDM).
[44] Vivienne Sze,et al. Design Considerations for Efficient Deep Neural Networks on Processing-in-Memory Accelerators , 2019, 2019 IEEE International Electron Devices Meeting (IEDM).
[45] Yandong Luo,et al. Monolithically Integrated RRAM- and CMOS-Based In-Memory Computing Optimizations for Efficient Deep Learning , 2019, IEEE Micro.
[46] Christian Enz,et al. Review and Benchmarking of Precision-Scalable Multiply-Accumulate Unit Architectures for Embedded Neural-Network Processing , 2019, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[47] Kaushik Roy,et al. Towards spike-based machine intelligence with neuromorphic computing , 2019, Nature.
[48] Hyeryung Jang,et al. An Introduction to Probabilistic Spiking Neural Networks: Probabilistic Models, Learning Rules, and Applications , 2019, IEEE Signal Processing Magazine.
[49] Luc Van Gool,et al. AI Benchmark: All About Deep Learning on Smartphones in 2019 , 2019, 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW).
[50] Matthias Bethge,et al. Engineering a Less Artificial Intelligence , 2019, Neuron.
[51] Sen Song,et al. Bridging Biological and Artificial Neural Networks with Emerging Neuromorphic Devices: Fundamentals, Progress, and Challenges , 2019, Advanced materials.
[52] Jeremy Kepner,et al. Survey and Benchmarking of Machine Learning Accelerators , 2019, 2019 IEEE High Performance Extreme Computing Conference (HPEC).
[53] Martin Trentzsch,et al. Design and Analysis of an Ultra-Dense, Low-Leakage, and Fast FeFET-Based Random Access Memory Array , 2019, IEEE Journal on Exploratory Solid-State Computational Devices and Circuits.
[54] Zhengya Zhang,et al. A fully integrated reprogrammable memristor–CMOS system for efficient multiply–accumulate operations , 2019, Nature Electronics.
[55] Hugo Van hamme,et al. 18μW SoC for near-microphone Keyword Spotting and Speaker Verification , 2019, 2019 Symposium on VLSI Circuits.
[56] Xu Chen,et al. Edge Intelligence: Paving the Last Mile of Artificial Intelligence With Edge Computing , 2019, Proceedings of the IEEE.
[57] Jeremy Kepner,et al. AI Enabling Technologies: A Survey , 2019, ArXiv.
[58] Hoi-Jun Yoo,et al. An Ultra-Low-Power Analog-Digital Hybrid CNN Face Recognition Processor Integrated with a CIS for Always-on Mobile Devices , 2019, 2019 IEEE International Symposium on Circuits and Systems (ISCAS).
[59] George K. Thiruvathukal,et al. Low-Power Computer Vision: Status, Challenges, and Opportunities , 2019, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[60] Andreas Burg,et al. A 0.5 V 2.5 μW/MHz Microcontroller with Analog-Assisted Adaptive Body Bias PVT Compensation with 3.13nW/kB SRAM Retention in 55nm Deeply-Depleted Channel CMOS , 2019, 2019 IEEE Custom Integrated Circuits Conference (CICC).
[61] Marian Verhelst,et al. Breaking High-Resolution CNN Bandwidth Barriers With Enhanced Depth-First Execution , 2019, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[62] Meng-Fan Chang,et al. 24.5 A Twin-8T SRAM Computation-In-Memory Macro for Multiple-Bit CNN-Based Machine Learning , 2019, 2019 IEEE International Solid- State Circuits Conference - (ISSCC).
[63] David Blaauw,et al. IoT2 — the Internet of Tiny Things: Realizing mm-Scale Sensors through 3D Die Stacking , 2019, 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[64] Jack Xin,et al. Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets , 2019, ICLR.
[65] Luca Benini,et al. Optimally Scheduling CNN Convolutions for Efficient Memory Access , 2019, ArXiv.
[66] Pulkit Jain,et al. 13.3 A 7Mb STT-MRAM in 22FFL FinFET Technology with 4ns Read Sensing Time at 0.9V Using Write-Verify-Write Scheme and Offset-Cancellation Sensing Technique , 2019, 2019 IEEE International Solid- State Circuits Conference - (ISSCC).
[67] Meng-Fan Chang,et al. 24.1 A 1Mb Multibit ReRAM Computing-In-Memory Macro with 14.6ns Parallel MAC Computing Time for CNN Based AI Edge Processors , 2019, 2019 IEEE International Solid- State Circuits Conference - (ISSCC).
[68] Pulkit Jain,et al. 13.2 A 3.6Mb 10.1Mb/mm2 Embedded Non-Volatile ReRAM Macro in 22nm FinFET Technology with Adaptive Forming/Set/Reset Schemes Yielding Down to 0.5V with Sensing Time of 5ns at 0.7V , 2019, 2019 IEEE International Solid- State Circuits Conference - (ISSCC).
[69] Hoi-Jun Yoo,et al. UNPU: An Energy-Efficient Deep Neural Network Accelerator With Fully Variable Weight Bit Precision , 2019, IEEE Journal of Solid-State Circuits.
[70] Frank Hutter,et al. Neural Architecture Search: A Survey , 2018, J. Mach. Learn. Res..
[71] Xindong Wu,et al. Object Detection With Deep Learning: A Review , 2018, IEEE Transactions on Neural Networks and Learning Systems.
[72] Vivienne Sze,et al. Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices , 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[73] Luca Benini,et al. Hyperdrive: A Multi-Chip Systolically Scalable Binary-Weight CNN Inference Engine , 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[74] Alessandro Aimar,et al. NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps , 2017, IEEE Transactions on Neural Networks and Learning Systems.
[75] Tayfun Gokmen,et al. The Next Generation of Deep Learning Hardware: Analog Computing , 2019, Proceedings of the IEEE.
[76] S. H. Han,et al. Demonstration of Highly Manufacturable STT-MRAM Embedded in 28nm Logic , 2018, 2018 IEEE International Electron Devices Meeting (IEDM).
[77] Carlos H. Diaz,et al. A 40nm Low-Power Logic Compatible Phase Change Memory Technology , 2018, 2018 IEEE International Electron Devices Meeting (IEDM).
[78] Massimo Alioto,et al. Energy-Quality Scalable Integrated Circuits and Systems: Continuing Energy Scaling in the Twilight of Moore’s Law , 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[79] Matthew Mattina,et al. SCALE-Sim: Systolic CNN Accelerator , 2018, ArXiv.
[80] Andreas Gerstlauer,et al. DeepThings: Distributed Adaptive Deep Learning Inference on Resource-Constrained IoT Edge Clusters , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[81] Sparsh Mittal,et al. A survey of FPGA-based accelerators for convolutional neural networks , 2018, Neural Computing and Applications.
[82] Alexander Fish,et al. A 14.3pW Sub-Threshold 2T Gain-Cell eDRAM for Ultra-Low Power IoT Applications in 28nm FD-SOI , 2018, 2018 IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S).
[83] Hoi-Jun Yoo,et al. DNPU: An Energy-Efficient Deep-Learning Processor with Heterogeneous Multi-Core Architecture , 2018, IEEE Micro.
[84] Dylan Malone Stuart,et al. Memory Requirements for Convolutional Neural Network Hardware Accelerators , 2018, 2018 IEEE International Symposium on Workload Characterization (IISWC).
[85] Masoud Dehyadegari,et al. Designing Efficient Imprecise Adders using Multi-bit Approximate Building Blocks , 2018, ISLPED.
[86] Luca Benini,et al. XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference , 2018, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[87] Hui Liu,et al. On-Demand Deep Model Compression for Mobile Devices: A Usage-Driven Model Selection Framework , 2018, MobiSys.
[88] Marian Verhelst,et al. Bit Error Tolerance of a CIFAR-10 Binarized Convolutional Neural Network Processor , 2018, 2018 IEEE International Symposium on Circuits and Systems (ISCAS).
[89] Sujan Kumar Gonugondla,et al. An In-Memory VLSI Architecture for Convolutional Neural Networks , 2018, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[90] Boris Murmann,et al. BinarEye: An always-on energy-accuracy-scalable binary CNN processor with all memory on chip in 28nm CMOS , 2018, 2018 IEEE Custom Integrated Circuits Conference (CICC).
[91] Alexander Fish,et al. A 4-Transistor nMOS-Only Logic-Compatible Gain-Cell Embedded DRAM With Over 1.6-ms Retention Time at 700 mV in 28-nm FD-SOI , 2018, IEEE Transactions on Circuits and Systems I: Regular Papers.
[92] Tadahiro Kuroda,et al. QUEST: A 7.49TOPS multi-purpose log-quantized DNN inference engine stacked on 96MB 3D SRAM using inductive-coupling technology in 40nm CMOS , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).
[93] Marian Verhelst,et al. An always-on 3.8μJ/86% CIFAR-10 mixed-signal binary CNN processor with all memory on chip in 28nm CMOS , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).
[94] Chung-Cheng Chou,et al. An N40 256K×44 embedded RRAM macro with SL-precharge SA and low-voltage current limiter to improve read and write performance , 2018, 2018 IEEE International Solid - State Circuits Conference - (ISSCC).
[95] Shimeng Yu,et al. Neuro-Inspired Computing With Emerging Nonvolatile Memorys , 2018, Proceedings of the IEEE.
[96] Vikas Chandra,et al. CMSIS-NN: Efficient Neural Network Kernels for Arm Cortex-M CPUs , 2018, ArXiv.
[97] Hong Wang,et al. Loihi: A Neuromorphic Manycore Processor with On-Chip Learning , 2018, IEEE Micro.
[98] Asit K. Mishra,et al. Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy , 2017, ICLR.
[99] Steven J. Plimpton,et al. Multiscale Co-Design Analysis of Energy, Latency, Area, and Accuracy of a ReRAM Analog Neural Training Accelerator , 2017, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.
[100] Jia Deng,et al. Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-offs by Selective Execution , 2017, AAAI.
[101] Luca Benini,et al. YodaNN: An Architecture for Ultralow Power Binary-Weight CNN Acceleration , 2016, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[102] Hoi-Jun Yoo,et al. A Low-Power Convolutional Neural Network Face Recognition Processor and a CIS Integrated With Always-on Face Detector , 2018, IEEE Journal of Solid-State Circuits.
[103] Yundong Zhang,et al. Hello Edge: Keyword Spotting on Microcontrollers , 2017, ArXiv.
[104] M. Pons,et al. PVT compensation in Mie Fujitsu 55 nm DDC: A standard-cell library based comparison , 2017, 2017 IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S).
[105] Huazhong Yang,et al. CORAL: Coarse-grained reconfigurable architecture for Convolutional Neural Networks , 2017, 2017 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).
[106] Yongqiang Lyu,et al. Approximate Computing for Low Power and Security in the Internet of Things , 2017, Computer.
[107] H. T. Kung,et al. Distributed Deep Neural Networks Over the Cloud, the Edge and End Devices , 2017, 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS).
[108] O. Weber. FDSOI vs FinFET: differentiating device features for ultra low power & IoT applications , 2017, 2017 IEEE International Conference on IC Design and Technology (ICICDT).
[109] Catherine D. Schuman,et al. A Survey of Neuromorphic Computing and Neural Networks in Hardware , 2017, ArXiv.
[110] Hoi-Jun Yoo,et al. An energy-efficient deep learning processor with heterogeneous multi-core architecture for convolutional neural networks and recurrent neural networks , 2017, 2017 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS).
[111] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[112] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[113] Luca Benini,et al. CBinfer: Change-Based Inference for Convolutional Neural Networks on Video Data , 2017, ICDSC.
[114] Mingyu Gao,et al. TETRIS , 2017.
[115] Christoforos E. Kozyrakis,et al. TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory , 2017, ASPLOS.
[116] Shengen Yan,et al. Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs , 2017, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM).
[117] Vivienne Sze,et al. Efficient Processing of Deep Neural Networks: A Tutorial and Survey , 2017, Proceedings of the IEEE.
[118] Zhuo Wang,et al. In-Memory Computation of a Machine-Learning Classifier in a Standard 6T SRAM Array , 2017, IEEE Journal of Solid-State Circuits.
[119] Marian Verhelst,et al. 14.5 Envision: A 0.26-to-10TOPS/W subword-parallel dynamic-voltage-accuracy-frequency-scalable Convolutional Neural Network processor in 28nm FDSOI , 2017, 2017 IEEE International Solid-State Circuits Conference (ISSCC).
[120] David Blaauw,et al. 14.7 A 288µW programmable deep-learning processor with 270KB on-chip weight storage using non-uniform memory hierarchy for mobile intelligence , 2017, 2017 IEEE International Solid-State Circuits Conference (ISSCC).
[121] Youchang Kim,et al. 14.6 A 0.62mW ultra-low-power convolutional-neural-network face-recognition processor and a CIS integrated with always-on haar-like face detector , 2017, 2017 IEEE International Solid-State Circuits Conference (ISSCC).
[122] Dmitry P. Vetrov,et al. Variational Dropout Sparsifies Deep Neural Networks , 2017, ICML.
[123] Vivienne Sze,et al. Hardware for machine learning: Challenges and opportunities , 2017, 2017 IEEE Custom Integrated Circuits Conference (CICC).
[124] Vivienne Sze,et al. Designing Energy-Efficient Convolutional Neural Networks Using Energy-Aware Pruning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[125] Patrick Judd,et al. Stripes: Bit-serial deep neural network computing , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[126] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[127] V. Sze,et al. Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks , 2016, IEEE Journal of Solid-State Circuits.
[128] Qian Wang,et al. A novel data format for approximate arithmetic computing , 2017, 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC).
[129] J. Mazurier,et al. 22nm FDSOI technology for emerging mobile, Internet-of-Things, and RF applications , 2016, 2016 IEEE International Electron Devices Meeting (IEDM).
[130] H. T. Kung,et al. BranchyNet: Fast inference via early exiting from deep neural networks , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).
[131] An Chen,et al. A review of emerging non-volatile memory (NVM) technologies and applications , 2016 .
[132] Boris Murmann,et al. An 8-bit, 16 input, 3.2 pJ/op switched-capacitor dot product circuit in 28-nm FDSOI CMOS , 2016, 2016 IEEE Asian Solid-State Circuits Conference (A-SSCC).
[133] Manoj Alwani,et al. Fused-layer CNN accelerators , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[134] Luca Benini,et al. Power, Area, and Performance Optimization of Standard Cell Memory Arrays Through Controlled Placement , 2016, TODE.
[135] Jean-Luc Nagel,et al. Sub-threshold latch-based icyflex2 32-bit processor with wide supply range operation , 2016, 2016 46th European Solid-State Device Research Conference (ESSDERC).
[136] Benton H. Calhoun,et al. A 55nm Ultra Low Leakage Deeply Depleted Channel technology optimized for energy minimization in subthreshold SRAM and logic , 2016, ESSCIRC Conference 2016: 42nd European Solid-State Circuits Conference.
[137] Natalie D. Enright Jerger,et al. Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[138] Lin Zhong,et al. RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[139] Xuan Yang,et al. A Systematic Approach to Blocking Convolutional Neural Networks , 2016, ArXiv.
[140] Weisong Shi,et al. Edge Computing: Vision and Challenges , 2016, IEEE Internet of Things Journal.
[141] Vivienne Sze,et al. Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[142] Ashok Veeraraghavan,et al. ASP Vision: Optically Computing the First Layer of Convolutional Neural Networks Using Angle Sensitive Pixels , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[143] Luca Benini,et al. High-efficiency logarithmic number unit design based on an improved cotransformation scheme , 2016, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[144] Hoi-Jun Yoo,et al. 14.1 A 126.1mW real-time natural UI/UX processor with embedded deep-learning core for low-power smart glasses , 2016, 2016 IEEE International Solid-State Circuits Conference (ISSCC).
[145] Forrest N. Iandola,et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.
[146] Soheil Ghiasi,et al. Hardware-oriented Approximation of Convolutional Neural Networks , 2016, ArXiv.
[147] Yoshua Bengio,et al. BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1 , 2016, ArXiv.
[148] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[149] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[150] Sachin S. Talathi,et al. Fixed Point Quantization of Deep Convolutional Networks , 2015, ICML.
[151] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[152] Andrew Lavin,et al. Fast Algorithms for Convolutional Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[153] Kaushik Roy,et al. Conditional Deep Learning for energy-efficient and enhanced pattern recognition , 2015, 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE).
[154] Daniel Bankman,et al. An 8-bit, 16 input, 3.2 pJ/op switched-capacitor dot product circuit in 28-nm FDSOI CMOS , 2016.
[155] Youchang Kim,et al. A 2.71 nJ/Pixel Gaze-Activated Object Recognition System for Low-Power Mobile Smart Glasses , 2016, IEEE Journal of Solid-State Circuits.
[156] Naveen Verma,et al. Realizing Low-Energy Classification Systems by Implementing Matrix Multiplication Directly Within an ADC , 2015, IEEE Transactions on Biomedical Circuits and Systems.
[157] Wonyong Sung,et al. Resiliency of Deep Neural Networks under Quantization , 2015, ArXiv.
[158] Christian Piguet,et al. A 1kb single-side read 6T sub-threshold SRAM in 180 nm with 530 Hz frequency 3.1 nA total current and 2.4 nA leakage at 0.27 V , 2015, 2015 IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S).
[159] Bernard Brezzo,et al. TrueNorth: Design and Tool Flow of a 65 mW 1 Million Neuron Programmable Neurosynaptic Chip , 2015, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems.
[160] Marian Verhelst,et al. DVAS: Dynamic Voltage Accuracy Scaling for increased energy-efficiency in approximate computing , 2015, 2015 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED).
[161] Tianshi Chen,et al. ShiDianNao: Shifting vision processing closer to the sensor , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[162] Song Han,et al. Learning both Weights and Connections for Efficient Neural Network , 2015, NIPS.
[163] Jie Liu,et al. Scalable-effort classifiers for energy-efficient machine learning , 2015, DAC.
[164] Kaushik Roy,et al. Approximate computing and the quest for computing efficiency , 2015, 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC).
[165] Wonyong Sung,et al. Fixed point optimization of deep convolutional neural networks for object recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[166] Mark Y. Liu,et al. A 14nm logic technology featuring 2nd-generation FinFET, air-gapped interconnects, self-aligned double patterning and a 0.0588 µm² SRAM cell size , 2014, 2014 IEEE International Electron Devices Meeting.
[167] Jason Cong,et al. Minimizing Computation in Convolutional Neural Networks , 2014, ICANN.
[168] Berin Martini,et al. A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[169] Zhen Wang,et al. Draining our glass: an energy and heat characterization of Google Glass , 2014, APSys.
[170] Mark Horowitz,et al. 1.1 Computing's energy problem (and what we can do about it) , 2014, 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC).
[171] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[172] Yann LeCun,et al. Fast Training of Convolutional Networks through FFTs , 2013, ICLR.
[173] Jean-Luc Nagel,et al. Ultra low-power standard cell design using planar bulk CMOS in subthreshold operation , 2013, 2013 23rd International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS).
[174] Ebru Arisoy,et al. Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[175] Masahide Matsumoto,et al. A 130.7mm² 2-layer 32Gb ReRAM memory device in 24nm technology , 2013, 2013 IEEE International Solid-State Circuits Conference Digest of Technical Papers.
[176] Marimuthu Palaniswami,et al. Internet of Things (IoT): A vision, architectural elements, and future directions , 2012, Future Gener. Comput. Syst..
[177] David Blaauw,et al. A Modular 1 mm³ Die-Stacked Sensing Platform With Low Power I²C Inter-Die Communication and Multi-Modal Energy Harvesting , 2013, IEEE Journal of Solid-State Circuits.
[178] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[179] C. Auth,et al. A 22nm high performance and low-power CMOS technology featuring fully-depleted tri-gate transistors, self-aligned contacts and high density MIM capacitors , 2012, 2012 Symposium on VLSI Technology (VLSIT).
[180] J. Jeddeloh,et al. Hybrid memory cube new DRAM architecture increases density and performance , 2012, 2012 Symposium on VLSI Technology (VLSIT).
[181] Sander M. Bohte,et al. Computing with Spiking Neuron Networks , 2012, Handbook of Natural Computing.
[182] Stephen Berard,et al. Implications of Historical Trends in the Electrical Efficiency of Computing , 2011, IEEE Annals of the History of Computing.
[183] Hoi-Jun Yoo,et al. A 57mW embedded mixed-mode neuro-fuzzy accelerator for intelligent multi-core processor , 2011, 2011 IEEE International Solid-State Circuits Conference.
[184] Pradip Bose,et al. Power Wall , 2011, Encyclopedia of Parallel Computing.
[185] Ju-Wan Lee,et al. Comparison of SOI FinFETs and Bulk FinFETs , 2009 .
[186] Samuel Williams,et al. Roofline: an insightful visual performance model for multicore architectures , 2009, CACM.
[187] Pentti Kanerva,et al. Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors , 2009, Cognitive Computation.
[188] Mark Bohr,et al. A 30 Year Retrospective on Dennard's MOSFET Scaling Paper , 2007, IEEE Solid-State Circuits Newsletter.
[189] Robert H. Dennard,et al. A 30 Year Retrospective on Dennard's MOSFET Scaling Paper , 2007 .
[190] A. Nordström,et al. What's next for WHO? , 2006, The Lancet.
[191] G. Moore. Cramming more components onto integrated circuits, Reprinted from Electronics, volume 38, number 8, April 19, 1965, pp. 114 ff., 2006, IEEE Solid-State Circuits Newsletter.
[192] Raymond Laflamme,et al. An Introduction to Quantum Computing , 2007, Quantum Inf. Comput..
[193] Sally A. McKee,et al. Reflections on the memory wall , 2004, CF '04.
[194] Alan F. Murray,et al. IEEE International Solid-State Circuits Conference , 2001.
[195] Theo Ungerer,et al. Multiple-Issue Processors , 1999 .
[196] G.E. Moore,et al. Cramming More Components Onto Integrated Circuits , 1998, Proceedings of the IEEE.
[197] Piero Olivo,et al. Flash memory cells-an overview , 1997, Proc. IEEE.
[198] Thomas D. Burd,et al. Processor design for portable systems , 1996, J. VLSI Signal Process..
[199] Scott Shenker,et al. Scheduling for reduced CPU energy , 1994, OSDI '94.
[200] John von Neumann,et al. First draft of a report on the EDVAC , 1993, IEEE Annals of the History of Computing.
[201] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.