Inplace knowledge distillation with teacher assistant for improved training of flexible deep neural networks
[1] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, arXiv.
[2] Thomas S. Huang, et al. Universally Slimmable Networks and Improved Training Techniques, 2019, IEEE/CVF International Conference on Computer Vision (ICCV).
[3] Bohan Zhuang, et al. Switchable Precision Neural Networks, 2020, arXiv.
[4] Seyed Iman Mirzadeh, et al. Improved Knowledge Distillation via Teacher Assistant, 2020, AAAI.
[5] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[6] Kilian Q. Weinberger, et al. Densely Connected Convolutional Networks, 2017, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[7] Distilled Hierarchical Neural Ensembles with Adaptive Inference Cost, 2020, arXiv.
[8] Tao Zhang, et al. Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges, 2018, IEEE Signal Processing Magazine.
[9] Jakob Verbeek, et al. Adaptative Inference Cost With Convolutional Neural Mixture Models, 2019, IEEE/CVF International Conference on Computer Vision (ICCV).
[10] Vincent Vanhoucke, et al. Improving the Speed of Neural Networks on CPUs, 2011.
[11] Kilian Q. Weinberger, et al. Multi-Scale Dense Networks for Resource Efficient Image Classification, 2017, ICLR.
[12] Yann LeCun, et al. Optimal Brain Damage, 1989, NIPS.
[13] Chong Wang, et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin, 2015, ICML.
[14] Enhua Wu, et al. Squeeze-and-Excitation Networks, 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[15] Ning Xu, et al. Slimmable Neural Networks, 2018, ICLR.
[16] Bo Chen, et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017, arXiv.
[17] Chuang Gan, et al. Once for All: Train One Network and Specialize It for Efficient Deployment, 2019, ICLR.
[18] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[19] Zhiqiang Shen, et al. Learning Efficient Convolutional Networks through Network Slimming, 2017, IEEE International Conference on Computer Vision (ICCV).
[20] Joan Bruna, et al. Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation, 2014, NIPS.
[21] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.