Pushing the Envelope of Dynamic Spatial Gating Technologies
Dibakar Gope | Jesse G. Beu | Urmish Thakker | Xueqin Huang