Revisiting Dropout: Escaping Pressure for Training Neural Networks with Multiple Costs
Seung-Hoon Na | Junhyug Noh | Sangmin Woo | Kangil Kim | Shin Jong Hun
[1] Leonidas J. Guibas, et al. Taskonomy: Disentangling Task Transfer Learning, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[2] Satoshi Nakamura, et al. Neural Machine Translation via Binary Code Prediction, 2017, ACL.
[3] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[4] Ali Farhadi, et al. You Only Look Once: Unified, Real-Time Object Detection, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[5] Iasonas Kokkinos, et al. UberNet: Training a Universal Convolutional Neural Network for Low-, Mid-, and High-Level Vision Using Diverse Datasets and Limited Memory, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[6] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.
[7] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[8] Christopher D. Manning, et al. Get To The Point: Summarization with Pointer-Generator Networks, 2017, ACL.
[9] Jason Weston, et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML '08.
[10] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[11] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[12] Vladlen Koltun, et al. Multi-Task Learning as Multi-Objective Optimization, 2018, NeurIPS.
[13] Marcus Rohrbach, et al. 12-in-1: Multi-Task Vision and Language Representation Learning, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[14] Jeffrey Dean, et al. Distributed Representations of Words and Phrases and their Compositionality, 2013, NIPS.
[15] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[16] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[17] Jonathan Baxter, et al. A Model of Inductive Bias Learning, 2000, J. Artif. Intell. Res.
[18] Rich Caruana, et al. Multitask Learning, 1998, Encyclopedia of Machine Learning and Data Mining.
[19] Xiaodong Liu, et al. Multi-Task Deep Neural Networks for Natural Language Understanding, 2019, ACL.
[20] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[21] Zhiguo Wang, et al. Coverage Embedding Models for Neural Machine Translation, 2016, EMNLP.
[22] Patrice Y. Simard, et al. Best practices for convolutional neural networks applied to visual document analysis, 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.
[23] Roberto Cipolla, et al. Multi-task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics, 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[24] Quoc V. Le, et al. BAM! Born-Again Multi-Task Networks for Natural Language Understanding, 2019, ACL.
[25] Wojciech Zaremba, et al. Recurrent Neural Network Regularization, 2014, ArXiv.
[26] Philipp Koehn, et al. Moses: Open Source Toolkit for Statistical Machine Translation, 2007, ACL.
[27] Pierre Baldi, et al. Understanding Dropout, 2013, NIPS.
[28] Hynek Hermansky, et al. Sparse Multilayer Perceptron for Phoneme Recognition, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[29] Jörg Fliege, et al. Steepest descent methods for multicriteria optimization, 2000, Math. Methods Oper. Res.
[30] Nitish Srivastava, et al. Improving neural networks by preventing co-adaptation of feature detectors, 2012, ArXiv.
[31] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[32] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[33] Chris Callison-Burch, et al. Open Source Toolkit for Statistical Machine Translation: Factored Translation Models and Lattice Decoding, 2006.
[34] Kaiming He, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.