[1] Ronald J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[2] Stefan Riezler, et al. Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation, 2017, EMNLP.
[3] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, ArXiv.
[4] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[5] Ming-Wei Chang, et al. Maximum Margin Reward Networks for Learning from Explicit and Implicit Supervision, 2017, EMNLP.
[6] Nan Jiang, et al. Doubly Robust Off-policy Value Evaluation for Reinforcement Learning, 2015, ICML.
[7] Ming-Wei Chang, et al. The Value of Semantic Parse Labeling for Knowledge Base Question Answering, 2016, ACL.
[8] Mirella Lapata, et al. Language to Logical Form with Neural Attention, 2016, ACL.
[9] Thorsten Joachims, et al. Batch Learning from Logged Bandit Feedback through Counterfactual Risk Minimization, 2015, J. Mach. Learn. Res.
[10] Dan Roth, et al. Learning from Natural Instructions, 2011, Machine Learning.
[11] Stefan Riezler, et al. NLmaps: A Natural Language Interface to Query OpenStreetMap, 2016, COLING.
[12] Hang Li, et al. Coupling Distributed and Symbolic Execution for Natural Language Queries, 2016, ICML.
[13] John Langford, et al. Doubly Robust Policy Evaluation and Learning, 2011, ICML.
[14] Stefan Riezler, et al. A Corpus and Semantic Parser for Multilingual Natural Language Querying of OpenStreetMap, 2016, NAACL.
[15] Rico Sennrich, et al. Nematus: A Toolkit for Neural Machine Translation, 2017, EACL.
[16] Thorsten Joachims, et al. The Self-Normalized Estimator for Counterfactual Learning, 2015, NIPS.
[17] Joaquin Quiñonero Candela, et al. Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising, 2013, J. Mach. Learn. Res.
[18] Stefan Riezler, et al. Counterfactual Learning for Machine Translation: Degeneracies and Solutions, 2017, ArXiv.
[19] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method, 2012, ArXiv.
[20] M. de Rijke, et al. Deep Learning with Logged Bandit Feedback, 2018, ICLR.
[21] Chen Liang, et al. Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision, 2016, ACL.
[22] K. J. Evans, et al. Computer Intensive Methods for Testing Hypotheses: An Introduction, 1990.
[23] D. Rubin, et al. The Central Role of the Propensity Score in Observational Studies for Causal Effects, 1983.
[24] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[25] Martín Abadi, et al. Learning a Natural Language Interface with Neural Programmer, 2016, ICLR.
[26] Alvin Cheung, et al. Learning a Neural Semantic Parser from User Feedback, 2017, ACL.
[27] P. Green. On Use of the EM Algorithm for Penalized Likelihood Estimation, 1990.
[28] Philip S. Thomas, et al. Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning, 2016, ICML.
[29] Luke S. Zettlemoyer, et al. Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions, 2013, TACL.
[30] Doina Precup, et al. Eligibility Traces for Off-Policy Policy Evaluation, 2000, ICML.
[31] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[32] Percy Liang, et al. Data Recombination for Neural Semantic Parsing, 2016, ACL.