Optimizing Data Usage via Differentiable Rewards
Xinyi Wang | Hieu Pham | Paul Michel | Antonios Anastasopoulos | J. Carbonell | Graham Neubig