Feature-Level Ensemble Knowledge Distillation for Aggregating Knowledge from Multiple Networks