Characterizing the Deployment of Deep Neural Networks on Commercial Edge Devices
Bahar Asgari | Ramyad Hadidi | Hyesoon Kim | Tushar Krishna | Yilun Xie | Jiashen Cao
[1] Sudhakar Yalamanchili,et al. LODESTAR: Creating Locally-Dense CNNs for Efficient Inference on Systolic Arrays , 2019, 2019 56th ACM/IEEE Design Automation Conference (DAC).
[2] Францкевич Кирилл Эдуардович,et al. Study of a Cluster System Based on Raspberry Pi 3B Single-Board Computers , 2019 .
[3] Massimo Banzi,et al. Make: Getting Started with Arduino: The Open Source Electronics Prototyping Platform , 2014 .
[4] Tianshi Chen,et al. ShiDianNao: Shifting vision processing closer to the sensor , 2015, 2015 ACM/IEEE 42nd Annual International Symposium on Computer Architecture (ISCA).
[5] Jason Cong,et al. Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks , 2015, FPGA.
[6] Erdogan Dogdu,et al. Context-Aware Computing, Learning, and Big Data in Internet of Things: A Survey , 2018, IEEE Internet of Things Journal.
[7] Song Han,et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[8] Sachin S. Talathi,et al. Fixed Point Quantization of Deep Convolutional Networks , 2015, ICML.
[9] Yu Wang,et al. Going Deeper with Embedded FPGA Platform for Convolutional Neural Network , 2016, FPGA.
[10] Yifan Wang,et al. pCAMP: Performance Comparison of Machine Learning Packages on the Edges , 2019, HotEdge.
[11] Michael S. Ryoo,et al. Collaborative Execution of Deep Neural Networks on Internet of Things Devices , 2019, ArXiv.
[12] Bahar Asgari,et al. Capella: Customizing Perception for Edge Devices by Efficiently Allocating FPGAs to DNNs , 2019, 2019 29th International Conference on Field Programmable Logic and Applications (FPL).
[13] Forrest N. Iandola,et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.
[14] Zheng Zhang,et al. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems , 2015, ArXiv.
[15] Ashutosh Kumar Singh,et al. Machine Learning for High-Throughput Stress Phenotyping in Plants , 2016, Trends in plant science.
[16] Sarmad Ullah Khan,et al. Future Internet: The Internet of Things Architecture, Possible Applications and Key Challenges , 2012, 2012 10th International Conference on Frontiers of Information Technology.
[17] In Lee,et al. The Internet of Things (IoT): Applications, investments, and challenges for enterprises , 2015 .
[18] Ramyad Hadidi,et al. An Edge-Centric Scalable Intelligent Framework To Collaboratively Execute DNN , 2019 .
[19] Crefeda Faviola Rodrigues,et al. SyNERGY: An energy measurement and prediction framework for Convolutional Neural Networks on Jetson TX1 , 2018 .
[20] Ali Farhadi,et al. You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Yiran Chen,et al. Learning Structured Sparsity in Deep Neural Networks , 2016, NIPS.
[22] Sudhakar Yalamanchili,et al. Optimizing Data Warehousing Applications for GPUs Using Kernel Fusion/Fission , 2012, 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum.
[23] Ninghui Sun,et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning , 2014, ASPLOS.
[24] Lin Zhong,et al. RedEye: Analog ConvNet Image Sensor Architecture for Continuous Mobile Vision , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[25] Andrew Zisserman,et al. Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.
[26] Pritish Narayanan,et al. Deep Learning with Limited Numerical Precision , 2015, ICML.
[27] Ming Yang,et al. Compressing Deep Convolutional Networks using Vector Quantization , 2014, ArXiv.
[28] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[29] Alex Krizhevsky,et al. One weird trick for parallelizing convolutional neural networks , 2014, ArXiv.
[30] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[31] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[32] Trevor Darrell,et al. Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.
[33] Scott A. Mahlke,et al. Scalpel: Customizing DNN pruning to the underlying hardware parallelism , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[34] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[35] Haichen Shen,et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning , 2018, OSDI.
[36] Wonyong Sung,et al. Structured Pruning of Deep Convolutional Neural Networks , 2015, ACM J. Emerg. Technol. Comput. Syst.
[37] Trevor N. Mudge,et al. Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge , 2017, ASPLOS.
[38] Michael S. Ryoo,et al. Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning , 2017, AAAI.
[39] Sudhakar Yalamanchili,et al. ERIDANUS: Efficiently Running Inference of DNNs Using Systolic Arrays , 2019, IEEE Micro.
[40] Xiangyu Zhang,et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[41] Roland Siegwart,et al. From perception to decision: A data-driven approach to end-to-end motion planning for autonomous ground robots , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[42] Lida Xu,et al. The internet of things: a survey , 2014, Information Systems Frontiers.
[43] Dexmont Peña,et al. Benchmarking of CNNs for Low-Cost, Low-Power Robotics Applications , 2010 .
[44] Yu Cao,et al. Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks , 2016, FPGA.
[45] Yoshua Bengio,et al. Training deep neural networks with low precision multiplications , 2014 .
[46] Michael S. Ryoo,et al. Musical Chair: Efficient Real-Time Recognition Using Collaborative IoT Devices , 2018, ArXiv.
[47] Michael S. Ryoo,et al. Real-Time Image Recognition Using Collaborative IoT Devices , 2018, ReQuEST@ASPLOS.
[48] Yiran Chen,et al. MoDNN: Local distributed mobile computing system for Deep Neural Network , 2017, Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.
[49] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput.
[50] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[51] Huimin Lu,et al. Motor Anomaly Detection for Unmanned Aerial Vehicles Using Reinforcement Learning , 2018, IEEE Internet of Things Journal.
[52] Sergey Ioffe,et al. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning , 2016, AAAI.
[53] Ramyad Hadidi,et al. Characterizing the Execution of Deep Neural Networks on Collaborative Robots and Edge Devices , 2019, PEARC.
[54] Philip Heng Wai Leong,et al. FINN: A Framework for Fast, Scalable Binarized Neural Network Inference , 2016, FPGA.
[55] Gu-Yeon Wei,et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).
[56] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[57] Song Han,et al. ESE: Efficient Speech Recognition Engine with Sparse LSTM on FPGA , 2016, FPGA.
[58] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[59] 김종영. Introduction to Google TensorFlow , 2015 .
[60] Louis B. Rall,et al. Automatic Differentiation: Techniques and Applications , 1981, Lecture Notes in Computer Science.
[61] Lina Yao,et al. Deep Learning Based Recommender System , 2017, ACM Comput. Surv.
[62] Xin Wang,et al. Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks , 2017, NIPS.
[63] Vincent Vanhoucke,et al. Improving the speed of neural networks on CPUs , 2011 .
[64] Jia Wang,et al. DaDianNao: A Machine-Learning Supercomputer , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.
[65] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.
[66] Lorenzo Torresani,et al. Learning Spatiotemporal Features with 3D Convolutional Networks , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).
[67] Paolo Napoletano,et al. Benchmark Analysis of Representative Deep Neural Network Architectures , 2018, IEEE Access.
[68] Joel Emer,et al. Eyeriss: an Energy-efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks , 2022 .
[69] Ali Farhadi,et al. YOLOv3: An Incremental Improvement , 2018, ArXiv.
[70] Han Jie,et al. Road Scene Segmentation Based on NVIDIA Jetson TX2 , 2018 .
[71] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .
[72] Jiwen Lu,et al. Runtime Neural Pruning , 2017, NIPS.
[73] David A. Patterson,et al. In-datacenter performance analysis of a tensor processing unit , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[74] Kilian Q. Weinberger,et al. CondenseNet: An Efficient DenseNet Using Learned Group Convolutions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[75] Matti Siekkinen,et al. Latency and throughput characterization of convolutional neural networks for mobile computer vision , 2018, MMSys.
[76] Mark Sandler,et al. MobileNetV2: Inverted Residuals and Linear Bottlenecks , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[77] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[78] Asit K. Mishra,et al. From high-level deep neural models to FPGAs , 2016, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).
[79] Mohan M. Trivedi,et al. Looking at Humans in the Age of Self-Driving and Highly Automated Vehicles , 2016, IEEE Transactions on Intelligent Vehicles.
[80] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[81] Michael S. Ryoo,et al. Distributed Perception by Collaborative Robots , 2018, IEEE Robotics and Automation Letters.
[82] Matthew L. Merck,et al. Understanding the Power Consumption of Executing Deep Neural Networks on a Distributed Robot System , 2019 .
[83] Xin Zhang,et al. End to End Learning for Self-Driving Cars , 2016, ArXiv.