Knowledge Distillation via Route Constrained Optimization
Xiaolin Hu | Yu Liu | Xiao Jin | Junjie Yan | Yichao Wu | Ding Liang | Baoyun Peng | Jiaheng Liu