Ed H. Chi | Zhe Zhao | Anima Singh | Sagar Jain | Dong Lin | Rakesh Shivanna | Jiaxi Tang
[1] Zachary Chase Lipton, et al. Born Again Neural Networks, 2018, ICML.
[2] Nicolas Le Roux. Tighter bounds lead to improved classifiers, 2017, ICLR.
[3] Yoshua Bengio, et al. FitNets: Hints for Thin Deep Nets, 2014, ICLR.
[4] Rauf Izmailov, et al. Learning using privileged information: similarity control and knowledge transfer, 2015, J. Mach. Learn. Res.
[5] Jin Young Choi, et al. Knowledge Distillation with Adversarial Samples Supporting Decision Boundary, 2018, AAAI.
[6] Zhaoxiang Zhang, et al. DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer, 2017, AAAI.
[7] Richard Socher, et al. Regularizing and Optimizing LSTM Language Models, 2017, ICLR.
[8] Geoffrey E. Hinton, et al. When Does Label Smoothing Help?, 2019, NeurIPS.
[9] Jiashi Feng, et al. Revisit Knowledge Distillation: a Teacher-free Framework, 2019, ArXiv.
[10] Alexander M. Rush, et al. Sequence-Level Knowledge Distillation, 2016, EMNLP.
[11] Song Han, et al. Learning both Weights and Connections for Efficient Neural Network, 2015, NIPS.
[12] Bernhard Schölkopf, et al. Unifying distillation and privileged information, 2015, ICLR.
[13] Kan Chen, et al. Billion-scale semi-supervised learning for image classification, 2019, ArXiv.
[14] Zhi Zhang, et al. Bag of Tricks for Image Classification with Convolutional Neural Networks, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[15] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[16] Minjae Lee, et al. SVD-Softmax: Fast Softmax Approximation on Large Vocabulary Neural Networks, 2017, NIPS.
[17] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[18] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[19] Geoffrey E. Hinton, et al. Large scale distributed neural network training through online distillation, 2018, ICLR.
[20] Úlfar Erlingsson, et al. Scalable Private Learning with PATE, 2018, ICLR.
[21] Ke Wang, et al. Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System, 2018, KDD.
[22] Jason Weston, et al. Curriculum learning, 2009, ICML '09.
[23] Song Han, et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding, 2015, ICLR.
[24] Wei Zhang, et al. Heated-Up Softmax Embedding, 2018, ArXiv.
[25] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[26] Lukás Burget, et al. Recurrent neural network based language model, 2010, INTERSPEECH.
[27] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[28] Hassan Ghasemzadeh, et al. Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher, 2019, ArXiv.
[29] Zhe Zhao, et al. Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts, 2018, KDD.
[30] Razvan Pascanu, et al. Sobolev Training for Neural Networks, 2017, NIPS.
[31] François Fleuret, et al. Knowledge Transfer with Jacobian Matching, 2018, ICML.
[32] Ananthram Swami, et al. Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks, 2015, 2016 IEEE Symposium on Security and Privacy (SP).
[33] Junmo Kim, et al. A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[34] Seyed Iman Mirzadeh, et al. Improved Knowledge Distillation via Teacher Assistant, 2020, AAAI.
[35] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[36] Carlos D. Castillo, et al. L2-constrained Softmax Loss for Discriminative Face Verification, 2017, ArXiv.
[37] Michael S. Bernstein, et al. ImageNet Large Scale Visual Recognition Challenge, 2014, International Journal of Computer Vision.
[38] Christoph H. Lampert, et al. Towards Understanding Knowledge Distillation, 2019, ICML.