DropMax: Adaptive Variational Softmax
暂无分享,去创建一个
Eunho Yang | Juho Lee | Saehoon Kim | Sung Ju Hwang | Haebeom Lee | Eunho Yang | Juho Lee | Saehoon Kim | Haebeom Lee
[1] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[2] Pascal Vincent,et al. Dropout as data augmentation , 2015, ArXiv.
[3] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[4] Ariel D. Procaccia,et al. Variational Dropout and the Local Reparameterization Trick , 2015, NIPS.
[5] Yee Whye Teh,et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.
[6] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.
[7] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[8] Pietro Perona,et al. The Caltech-UCSD Birds-200-2011 Dataset , 2011 .
[9] Quoc V. Le,et al. Addressing the Rare Word Problem in Neural Machine Translation , 2014, ACL.
[10] Pascal Vincent,et al. An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family , 2015, ICLR.
[11] Yoram Singer,et al. Train faster, generalize better: Stability of stochastic gradient descent , 2015, ICML.
[12] Honglak Lee,et al. Learning Structured Output Representation using Deep Conditional Generative Models , 2015, NIPS.
[13] Yoshua Bengio,et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention , 2015, ICML.
[14] Zoubin Ghahramani,et al. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks , 2015, NIPS.
[15] Alex Kendall,et al. Concrete Dropout , 2017, NIPS.
[16] André Elisseeff,et al. Stability and Generalization , 2002, J. Mach. Learn. Res..
[17] Zoubin Ghahramani,et al. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.
[18] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[19] Brendan J. Frey,et al. Adaptive dropout for training deep neural networks , 2013, NIPS.
[20] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[21] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[23] Christopher D. Manning,et al. Fast dropout training , 2013, ICML.
[24] Dmitry P. Vetrov,et al. Variational Dropout Sparsifies Deep Neural Networks , 2017, ICML.
[25] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[26] Zoubin Ghahramani,et al. Bayesian Convolutional Neural Networks with Bernoulli Approximate Variational Inference , 2015, ArXiv.
[27] Christoph H. Lampert,et al. Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.
[28] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[29] Ramón Fernández Astudillo,et al. From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification , 2016, ICML.
[30] Alex Kendall,et al. What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? , 2017, NIPS.
[31] Yoshua Bengio,et al. On Using Very Large Target Vocabulary for Neural Machine Translation , 2014, ACL.