Offline Evaluation and Optimization for Interactive Systems
[1] Tapas Kanungo, et al. Model characterization curves for federated search using click-logs: predicting user engagement metrics for the span of feasible operating points, 2011, WWW.
[2] John Langford, et al. Doubly Robust Policy Evaluation and Learning, 2011, ICML.
[3] D. Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies, 1974.
[4] J. Heckman. Sample selection bias as a specification error, 1979.
[5] Deepak Agarwal, et al. Personalized click shaping through Lagrangian duality for online recommendation, 2012, SIGIR '12.
[6] Liang Zhang, et al. Activity ranking in LinkedIn feed, 2014, KDD.
[7] John Langford, et al. Sample-efficient Nonstationary Policy Evaluation for Contextual Bandits, 2012, UAI.
[8] Doina Precup, et al. Eligibility Traces for Off-Policy Policy Evaluation, 2000, ICML.
[9] Joaquin Quiñonero Candela, et al. Counterfactual reasoning and learning systems: the example of computational advertising, 2013, J. Mach. Learn. Res.
[10] John Langford, et al. Exploration scavenging, 2008, ICML '08.
[11] Diane Lambert, et al. More bang for their bucks: assessing new features for online advertisers, 2007, ADKDD.
[12] John Langford, et al. The offset tree for learning with partial labels, 2008, KDD.
[13] Katja Hofmann, et al. Fidelity, Soundness, and Efficiency of Interleaved Comparison Methods, 2013, TOIS.
[14] Wei Chu, et al. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms, 2011, WSDM '11.
[15] Lihong Li, et al. Counterfactual Estimation and Optimization of Click Metrics for Search Engines, 2014, arXiv.
[16] Deepak Agarwal, et al. Click shaping to optimize multiple objectives, 2011, KDD.
[17] Neil D. Lawrence, et al. Dataset Shift in Machine Learning, 2009.
[18] Olivier Nicol, et al. Improving offline evaluation of contextual bandit algorithms via bootstrapping techniques, 2014, ICML.
[19] John Langford, et al. Efficient Online Bootstrapping for Large Scale Learning, 2013, arXiv.
[20] Lihong Li, et al. Evaluation of Explore-Exploit Policies in Multi-result Ranking Systems, 2015, arXiv.
[21] Liang Tang, et al. Automatic ad format selection via contextual bandits, 2013, CIKM.
[22] J. Robins, et al. Estimation of Regression Coefficients When Some Regressors Are Not Always Observed, 1994.
[23] Lihong Li, et al. Toward Predicting the Outcome of an A/B Experiment for Search Relevance, 2015, WSDM.
[24] Rong Ge, et al. Evaluating online ad campaigns in a pipeline: causal models at scale, 2010, KDD.
[25] Lihong Li, et al. An Empirical Evaluation of Thompson Sampling, 2011, NIPS.
[26] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.
[27] Wei Chu, et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW '10.
[28] Katja Hofmann, et al. Reusing historical interaction data for faster online learning to rank for IR, 2013, DIR.
[29] Lihong Li, et al. Learning from Logged Implicit Exploration Data, 2010, NIPS.
[30] Wei Chu, et al. An Online Learning Framework for Refining Recency Search Results with User Click Feedback, 2012, TOIS.
[31] Stuart J. Russell, et al. Online bagging and boosting, 2005, IEEE International Conference on Systems, Man and Cybernetics.