[1] Xinkun Nie, et al. Why adaptively collected data have negative bias and how to correct for it, 2017, AISTATS.
[2] Eric M. Schwartz, et al. Dynamic Online Pricing with Incomplete Information Using Multi-Armed Bandit Experiments, 2018, Mark. Sci.
[3] Austin Daniel, et al. Reserve Price Optimization at Scale, 2016.
[4] Z. Dienes. How Bayes factors change scientific practice, 2016.
[5] Vasilis Syrgkanis, et al. Accurate Inference for Adaptive Linear Models, 2017, ICML.
[6] Lalit Jain, et al. A Bandit Approach to Sequential Experimental Design with False Discovery Control, 2018, NeurIPS.
[7] L. J. Savage, et al. The Foundations of Statistics, 1955.
[8] Luke Bornn, et al. Sequential Monte Carlo Bandits, 2013, ArXiv.
[9] Rémi Munos, et al. Pure Exploration in Multi-armed Bandits Problems, 2009, ALT.
[10] Liangjie Hong, et al. A Sequential Test for Selecting the Better Variant: Online A/B Testing, Adaptive Allocation, and Continuous Monitoring, 2019, WSDM.
[11] Philip S. Thomas, et al. Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning, 2016, ICML.
[12] Narayanan Sadagopan, et al. Contextual Multi-Armed Bandits for Causal Marketing, 2018, ArXiv.
[13] Elias Bareinboim, et al. Bandits with Unobserved Confounders: A Causal Approach, 2015, NIPS.
[14] Edoardo M. Airoldi, et al. Optimizing Cluster-based Randomized Experiments under Monotonicity, 2018, KDD.
[15] Nathan Kallus, et al. Instrument-Armed Bandits, 2017, ALT.
[16] Susan Athey, et al. Estimation Considerations in Contextual Bandits, 2017, ArXiv.
[17] H. Varian. Online Ad Auctions, 2009.
[18] Yu Wang, et al. LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions, 2017, ArXiv.
[19] Jie W. Weiss, et al. Bayesian Statistical Inference for Psychological Research, 2008.
[20] D. Lindley. A Statistical Paradox, 1957.
[21] Jeffrey Wong, et al. Incrementality Bidding & Attribution, 2018.
[22] Jun Wang, et al. Real-time bidding for online advertising: measurement and analysis, 2013, ADKDD '13.
[23] Rick P. Thomas, et al. When decision heuristics and science collide, 2013, Psychonomic Bulletin & Review.
[24] T. Amemiya. Tobit models: A survey, 1984.
[25] Dale J. Poirier, et al. Learning about the across-regime correlation in switching regression models, 1997.
[26] Benjamin Van Roy, et al. A Tutorial on Thompson Sampling, 2017, Found. Trends Mach. Learn.
[27] Shuchi Chawla, et al. A/B Testing of Auctions, 2016, EC.
[28] Nathan Kallus, et al. Balanced Policy Evaluation and Learning, 2017, NeurIPS.
[29] Maximilian Kasy, et al. Adaptive Treatment Assignment in Experiments for Policy Choice, 2019, Econometrica.
[30] Carl F. Mela, et al. Online Display Advertising Markets: A Literature Review and Future Directions, 2019, Inf. Syst. Res.
[31] Gary Koop, et al. Bayesian Econometric Methods, 2007.
[32] Thorsten Joachims, et al. Counterfactual Risk Minimization: Learning from Logged Bandit Feedback, 2015, ICML.
[33] J. Rouder. Optional stopping: No problem for Bayesians, 2014, Psychonomic Bulletin & Review.
[34] Di Wu, et al. A Multi-Agent Reinforcement Learning Method for Impression Allocation in Online Display Advertising, 2018, ArXiv.
[35] Peter Grünwald, et al. Optional Stopping with Bayes Factors: a categorization and extension of folklore results, with an application to invariant situations, 2018, ArXiv.
[36] Stefan Wager, et al. Policy Learning With Observational Data, 2017, Econometrica.
[37] Illtyd Trethowan. Causality, 1938.
[38] Lihong Li, et al. Learning from Logged Implicit Exploration Data, 2010, NIPS.
[39] Jack Bowden, et al. Multi-armed Bandit Models for the Optimal Design of Clinical Trials: Benefits and Challenges, 2015, Statistical Science.
[40] Tor Lattimore, et al. Causal Bandits: Learning Good Interventions via Causal Inference, 2016, NIPS.
[41] Peter E. Rossi, et al. Bayesian Statistics and Marketing, 2005.
[42] John Langford, et al. Doubly Robust Policy Evaluation and Learning, 2011, ICML.
[43] Steven L. Scott, et al. A modern Bayesian look at the multi-armed bandit, 2010.
[44] Alp Akcay, et al. Optimizing reserve prices for publishers in online ad auctions, 2019, IEEE Conference on Computational Intelligence for Financial Engineering & Economics (CIFEr).
[45] Tao Qin, et al. Estimation Bias in Multi-Armed Bandit Algorithms for Search Advertising, 2013, NIPS.
[46] S. Chib. Bayes inference in the Tobit censored regression model, 1992.
[47] Thomas T. Hills, et al. The frequentist implications of optional stopping on Bayesian hypothesis tests, 2013, Psychonomic Bulletin & Review.
[48] Nan Jiang, et al. Doubly Robust Off-policy Value Evaluation for Reinforcement Learning, 2015, ICML.
[49] Ron Berman, et al. Test & Roll: Profit-Maximizing A/B Tests, 2019, Mark. Sci.
[50] Felix D. Schönbrodt, et al. Sequential Hypothesis Testing With Bayes Factors: Efficiently Testing Mean Differences, 2017, Psychological Methods.
[51] D. Rubin, et al. Bayesian inference for causal effects in randomized experiments with noncompliance, 1997.
[52] Stefan Wager, et al. Efficient Policy Learning, 2017, ArXiv.
[53] Martin J. Wainwright, et al. A framework for Multi-A(rmed)/B(andit) Testing with Online FDR Control, 2017, NIPS.
[54] R. Olsen, et al. Note on the Uniqueness of the Maximum Likelihood Estimator for the Tobit Model, 1978.
[55] W. R. Thompson. On the Likelihood that One Unknown Probability Exceeds Another in View of the Evidence of Two Samples, 1933.
[56] L. Pekelis, et al. Always Valid Inference: Bringing Sequential Analysis to A/B Testing, 2015, arXiv:1512.04922.
[57] M. Mohri, et al. Bandit Problems, 2006.
[58] Miroslav Dudík, et al. Optimal and Adaptive Off-policy Evaluation in Contextual Bandits, 2016, ICML.
[59] Alex Deng, et al. Continuous Monitoring of A/B Tests without Pain: Optional Stopping in Bayesian Testing, 2016, IEEE International Conference on Data Science and Advanced Analytics (DSAA).
[60] Michael Ostrovsky, et al. Reserve Prices in Internet Advertising Auctions: A Field Experiment, 2009, Journal of Political Economy.
[61] Elias Bareinboim, et al. Counterfactual Data-Fusion for Online Reinforcement Learners, 2017, ICML.
[62] C. Glymour, et al. Statistics and Causal Inference, 1985.
[63] Joaquin Quiñonero Candela, et al. Counterfactual reasoning and learning systems: the example of computational advertising, 2013, J. Mach. Learn. Res.
[64] Mohsen Bayati, et al. Online Decision-Making with High-Dimensional Covariates, 2015.
[65] Lihong Li, et al. Counterfactual Estimation and Optimization of Click Metrics in Search Engines: A Case Study, 2015, WWW.
[66] Jacob D. Abernethy, et al. Dynamic Online Pricing with Incomplete Information, 2016.
[67] Wim P. M. Vijverberg, et al. Measuring the unidentified parameter of the extended Roy model of selectivity, 1993.
[68] Jun Wang, et al. Real-Time Bidding by Reinforcement Learning in Display Advertising, 2017, WSDM.
[69] Weinan Zhang, et al. Real-Time Bidding with Multi-Agent Reinforcement Learning in Display Advertising, 2018, CIKM.
[70] P. Grünwald, et al. Why optional stopping is a problem for Bayesians, 2017, arXiv:1708.08278.
[71] Claudio Gentile, et al. Regret Minimization for Reserve Prices in Second-Price Auctions, 2022, IEEE Transactions on Information Theory.
[72] A. Zeevi, et al. A Linear Response Bandit Problem, 2013.