Neta Zmora | Guy Jacob | Lev Zlotnik | Bar Elharar | Gal Novik
[1] Sile Wang, et al. Thinning of convolutional neural network with mixed pruning, 2019, IET Image Processing.
[2] William J. Dally, et al. Analog/Mixed-Signal Hardware Error Modeling for Deep Learning Inference, 2019, 56th ACM/IEEE Design Automation Conference (DAC).
[3] Yurong Chen, et al. Dynamic Network Surgery for Efficient DNNs, 2016, NIPS.
[4] Erich Elsen, et al. Exploring Sparsity in Recurrent Neural Networks, 2017, ICLR.
[5] M. Yuan, et al. Model selection and estimation in regression with grouped variables, 2006.
[6] Hadi Esmaeilzadeh, et al. SinReQ: Generalized Sinusoidal Regularization for Low-Bitwidth Deep Quantized Training, 2019.
[7] James Glass, et al. FAKTA: An Automatic End-to-End Fact Checking System, 2019, NAACL.
[8] Timo Aila, et al. Pruning Convolutional Neural Networks for Resource Efficient Inference, 2016, ICLR.
[9] Yifan Gong, et al. Restructuring of deep neural network acoustic models with singular value decomposition, 2013, INTERSPEECH.
[10] Uri Weiser, et al. SMT-SA: Simultaneous Multithreading in Systolic Arrays, 2019, IEEE Computer Architecture Letters.
[11] H. T. Kung, et al. BranchyNet: Fast inference via early exiting from deep neural networks, 2016, 23rd International Conference on Pattern Recognition (ICPR).
[12] Michael Carbin, et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks, 2018, ICLR.
[13] Zheng Zhang, et al. MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems, 2015, arXiv.
[14] Yiran Chen, et al. Learning Structured Sparsity in Deep Neural Networks, 2016, NIPS.
[15] Tinoosh Mohsenin, et al. Accelerating Convolutional Neural Network With FFT on Embedded Hardware, 2018, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[16] Tat-Seng Chua, et al. Neural Collaborative Filtering, 2017, WWW.
[17] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[18] Rui Peng, et al. Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures, 2016, arXiv.
[19] Martín Abadi, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2016, arXiv.
[20] Cheng Deng, et al. Cross Domain Model Compression by Structurally Weight Sharing, 2019, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[21] Song Han, et al. EIE: Efficient Inference Engine on Compressed Deep Neural Network, 2016, 43rd ACM/IEEE Annual International Symposium on Computer Architecture (ISCA).
[22] Daniel Soudry, et al. Post training 4-bit quantization of convolutional networks for rapid-deployment, 2018, NeurIPS.
[23] Bin Yu, et al. Structural Compression of Convolutional Neural Networks Based on Greedy Filter Pruning, 2017, arXiv.
[24] Amirsina Torfi, et al. Attention-Based Guided Structured Sparsity of Deep Neural Networks, 2018, arXiv.
[25] Sergio Guadarrama, et al. Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors, 2017, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Tim Kraska, et al. Smallify: Learning Network Size while Training, 2018, arXiv.
[27] Shuchang Zhou, et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients, 2016, arXiv.
[28] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, arXiv.
[29] Bertrand A. Maher, et al. Glow: Graph Lowering Compiler Techniques for Neural Networks, 2018, arXiv.
[30] Song Han, et al. Learning both Weights and Connections for Efficient Neural Network, 2015, NIPS.
[31] Hanan Samet, et al. Pruning Filters for Efficient ConvNets, 2016, ICLR.
[32] Zhiru Zhang, et al. Improving Neural Network Quantization without Retraining using Outlier Channel Splitting, 2019, ICML.
[33] Bo Chen, et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, 2018, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Eriko Nurvitadhi, et al. WRPN: Wide Reduced-Precision Networks, 2017, ICLR.
[35] Song Han, et al. AMC: AutoML for Model Compression and Acceleration on Mobile Devices, 2018, ECCV.
[36] Wei Liu, et al. PocketFlow: An Automated Framework for Compressing and Accelerating Deep Neural Networks, 2018.
[37] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[38] Swagath Venkataramani, et al. PACT: Parameterized Clipping Activation for Quantized Neural Networks, 2018, arXiv.
[39] Raghuraman Krishnamoorthi. Quantizing deep convolutional networks for efficient inference: A whitepaper, 2018, arXiv.
[40] Haichen Shen, et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning, 2018, OSDI.
[41] Suyog Gupta, et al. To prune, or not to prune: exploring the efficacy of pruning for model compression, 2017, ICLR.
[42] Xiangyu Zhang, et al. Channel Pruning for Accelerating Very Deep Neural Networks, 2017, IEEE International Conference on Computer Vision (ICCV).
[43] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, arXiv.