An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation
Makoto Takamoto | Yusuke Morishita | Hitoshi Imaoka