Yi Su | Akshay Krishnamurthy | Miroslav Dudík | Maria Dimakopoulou
[1] Lihong Li, et al. Learning from Logged Implicit Exploration Data, 2010, NIPS.
[2] Nicolò Cesa-Bianchi, et al. Combinatorial Bandits, 2012, COLT.
[3] Nathan Kallus, et al. Balanced Policy Evaluation and Learning, 2017, NeurIPS.
[4] Tao Qin, et al. Introducing LETOR 4.0 Datasets, 2013, ArXiv.
[5] Marie Frei, et al. Decoupling: From Dependence to Independence, 2016.
[6] C. Rothe. The Value of Knowing the Propensity Score for Estimating Average Treatment Effects, 2016, SSRN Electronic Journal.
[7] J. Robins, et al. Doubly Robust Estimation in Missing Data and Causal Inference Models, 2005, Biometrics.
[8] Wei Chu, et al. Unbiased Offline Evaluation of Contextual-Bandit-Based News Article Recommendation Algorithms, 2010, WSDM '11.
[9] J. Robins, et al. Semiparametric Efficiency in Multivariate Regression Models with Missing Data, 1995.
[10] Baruch Awerbuch, et al. Online Linear Optimization and Adaptive Routing, 2008, J. Comput. Syst. Sci.
[11] John Langford, et al. Doubly Robust Policy Evaluation and Optimization, 2014, ArXiv.
[12] Nathan Kallus, et al. A Framework for Optimal Matching for Causal Inference, 2016, AISTATS.
[13] G. Imbens, et al. Mean-Squared-Error Calculations for Average Treatment Effects, 2005.
[14] Mehrdad Farajtabar, et al. More Robust Doubly Robust Off-policy Evaluation, 2018, ICML.
[15] G. Imbens, et al. Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score, 2002.
[16] John Langford, et al. Doubly Robust Policy Evaluation and Learning, 2011, ICML.
[17] Thorsten Joachims, et al. The Self-Normalized Estimator for Counterfactual Learning, 2015, NIPS.
[18] Stefan Wager, et al. Efficient Policy Learning, 2017, ArXiv.
[19] Thomas P. Hayes, et al. The Price of Bandit Information for Online Optimization, 2007, NIPS.
[20] Mark J. van der Laan, et al. Data-Adaptive Selection of the Truncation Level for Inverse-Probability-of-Treatment-Weighted Estimators, 2008.
[21] Lihong Li, et al. Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study, 2015, WWW.
[22] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.
[23] J. Hahn. On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects, 1998.
[24] Joaquin Quiñonero Candela, et al. Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising, 2013, J. Mach. Learn. Res.
[25] Yi Su. Doubly Robust Off-policy Evaluation with Shrinkage, 2019.
[26] G. Imbens, et al. Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score, 2000.
[27] Philip S. Thomas, et al. Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning, 2016, ICML.
[28] Yi Su, et al. CAB: Continuous Adaptive Blending Estimator for Policy Evaluation and Learning, 2018, ArXiv.
[29] Michael R. Kosorok, et al. Residual Weighted Learning for Estimating Individualized Treatment Rules, 2015, Journal of the American Statistical Association.
[30] Joseph Kang, et al. Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data, 2007, arXiv:0804.2958.
[31] Thorsten Joachims, et al. Counterfactual Risk Minimization: Learning from Logged Bandit Feedback, 2015, ICML.
[32] D. Horvitz, et al. A Generalization of Sampling Without Replacement from a Finite Universe, 1952.
[33] John Langford, et al. Off-policy Evaluation for Slate Recommendation, 2016, NIPS.
[34] Miroslav Dudík, et al. Optimal and Adaptive Off-policy Evaluation in Contextual Bandits, 2016, ICML.
[35] Marie Davidian, et al. Comment: Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data, 2008, Statistical Science.