Memory Access Optimization for On-Chip Transfer Learning
[41] Qiaosha Zou,et al. A Communication-Aware DNN Accelerator on ImageNet Using In-Memory Entry-Counting Based Algorithm-Circuit-Architecture Co-Design in 65-nm CMOS , 2020, IEEE Journal on Emerging and Selected Topics in Circuits and Systems.