
Counterfactual reasoning and learning systems: the example of computational advertising

This work shows how to leverage causal inference to understand the behavior of complex learning systems interacting with their environment and predict the consequences of changes to the system. Such predictions allow both humans and algorithms to select the changes that would have improved the system performance. This work is illustrated by experiments on the ad placement system associated with the Bing search engine.

Joaquin Quiñonero Candela | David Maxwell Chickering | Elon Portugaly | Patrice Y. Simard | Léon Bottou | Jonas Peters | Dipankar Ray | Denis Xavier Charles | Ed Snelson

[1] F. H. Adler. Cybernetics, or Control and Communication in the Animal and the Machine, 1949.

[2] Norbert Wiener, et al. Cybernetics: Control and Communication in the Animal and the Machine, 1949.

[3] E. H. Simpson, et al. The Interpretation of Interaction in Contingency Tables, 1951.

[4] Norbert Wiener, et al. Cybernetics, Second Edition: or the Control and Communication in the Animal and the Machine, 1965.

[5] R. Khan, et al. Sequential Tests of Statistical Hypotheses, 1972.

[6] G. Wright, et al. Explanation and understanding, 1971.

[7] Vladimir Vapnik, A. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities, 1971.

[8] Norbert Sauer, et al. On the Density of Families of Sets, 1972, J. Comb. Theory, Ser. A.

[9] J. Gittins. Bandit processes and dynamic allocation indices, 1979.

[10] Roger B. Myerson, et al. Optimal Auction Design, 1981, Math. Oper. Res.

[11] D. A. Kenny, et al. Correlation and Causation, 1937, Wilmott.

[12] D. A. Kenny, et al. Correlation and Causation, 1982.

[13] Vladimir Vapnik, et al. Estimation of Dependences Based on Empirical Data (Springer Series in Statistics), 1982.

[14] P. Masani. Norbert Wiener, 1983.

[15] C. Charig, et al. Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy, 1986, British medical journal.

[16] Peter W. Glynn, et al. Likelihood ratio gradient estimation: an overview, 1987, WSC '87.

[17] S. Stigler. A Historical View of Statistical Concepts in Psychology and Educational Research, 1992, American Journal of Education.

[18] A. Genz. Numerical Computation of Multivariate Normal Probabilities, 1992.

[19] Richard S. Sutton, et al. Introduction to Reinforcement Learning, 1998.

[20] L. Reichl, et al. A Modern Course in Statistical Physics, 2nd Edition, 1998.

[21] J. Robins, et al. Marginal Structural Models and Causal Inference in Epidemiology, 2000, Epidemiology.

[22] H. Shimodaira, et al. Improving predictive inference under covariate shift by weighting the log-likelihood function, 2000.

[23] E. M. Lifshitz, et al. Course in Theoretical Physics, 2013.

[24] J. Pearl. Causality: Models, Reasoning and Inference, 2000.

[25] D. Allen. Making things happen, 2000, Nursing standard (Royal College of Nursing (Great Britain): 1987).

[26] S. Morris. COWLES FOUNDATION FOR RESEARCH IN ECONOMICS, 2001.

[27] Marek J. Druzdzel, et al. Caveats for Causal Reasoning with Equilibrium Models, 2001, ECSQARU.

[28] Leo Breiman, et al. Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author), 2001, Statistical Science.

[29] S. Lauritzen, et al. Chain graph models and their causal interpretations, 2002.

[30] Tom Burr, et al. Causation, Prediction, and Search, 2003, Technometrics.

[31] Peter Auer, et al. Finite-time Analysis of the Multiarmed Bandit Problem, 2002, Machine Learning.

[32] P. Spirtes, et al. Causal Inference of Ambiguous Manipulations, 2004, Philosophy of Science.

[33] Ronald J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.

[34] Paul Milgrom, et al. Putting Auction Theory to Work, 2004.

[35] Mehryar Mohri, et al. Multi-armed Bandit Algorithms and Empirical Evaluation, 2005, ECML.

[36] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.

[37] J. Robins, et al. Doubly Robust Estimation in Missing Data and Causal Inference Models, 2005, Biometrics.

[38] V. Vapnik. Estimation of Dependences Based on Empirical Data, 2006.

[39] F. Keil. Explanation and understanding, 2006, Annual review of psychology.

[40] Csaba Szepesvári, et al. Tuning Bandit Algorithms in Stochastic Environments, 2007, ALT.

[41] Klaus-Robert Müller, et al. Covariate Shift Adaptation by Importance Weighted Cross Validation, 2007, J. Mach. Learn. Res.

[42] John Langford, et al. The Epoch-Greedy Algorithm for Multi-armed Bandits with Side Information, 2007, NIPS.

[43] H. Robbins. Some aspects of the sequential design of experiments, 1952.

[44] Ron Kohavi, et al. Controlled experiments on the web: survey and practical guide, 2009, Data Mining and Knowledge Discovery.

[45] H. Varian. Online Ad Auctions, 2009.

[46] Massimiliano Pontil, et al. Empirical Bernstein Bounds and Sample-Variance Penalization, 2009, COLT.

[47] J. Pearl. Causal inference in statistics: An overview, 2009.

[48] Ashish Agarwal, et al. Overlapping experiment infrastructure: more, better, faster experimentation, 2010, KDD.

[49] Joaquin Quiñonero Candela, et al. Web-Scale Bayesian Click-Through rate Prediction for Sponsored Search Advertising in Microsoft's Bing Search Engine, 2010, ICML.

[50] M. I. Jordan. Leo Breiman, 2011, 1101.0929.

[51] D. Bergemann, et al. Dynamic Auctions: A Survey, 2010.

[52] Wei Chu, et al. A contextual-bandit approach to personalized news article recommendation, 2010, WWW '10.

[53] S. Athey, et al. A Structural Model of Sponsored Search Advertising Auctions, 2011.

[54] R. Preston McAfee, et al. Efficient Ranking in Sponsored Search, 2011, WINE.

[55] Lihong Li, et al. An Empirical Evaluation of Thompson Sampling, 2011, NIPS.

[56] Aleksandrs Slivkins, et al. Contextual Bandits with Similarity Information, 2009, COLT.

[57] Wei Chu, et al. Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms, 2010, WSDM '11.

[58] Judea Pearl, et al. The Do-Calculus Revisited, 2012, UAI.

[59] John Langford, et al. Sample-efficient Nonstationary Policy Evaluation for Contextual Bandits, 2012, UAI.

[60] John Shawe-Taylor, et al. PAC-Bayesian Inequalities for Martingales, 2011, IEEE Transactions on Information Theory.

[61] Léon Bottou, et al. From machine learning to machine reasoning, 2011, Machine Learning.

[62] Doina Precup, et al. Algorithms for multi-armed bandit problems, 2014, ArXiv.

[63] P. Glynn. Likelihood Ratio Gradient Estimation: An Overview, 2022.