Paper Information - Optimization Experiments in the Continuous Space - The Limited Growth Optimistic Optimization Algorithm


Online controlled experiments are extensively used by web-facing companies to validate and optimize their systems, providing a competitive advantage in their business. As the number of experiments scales, companies aim to invest their experimentation resources in larger feature changes and leave automated techniques to optimize smaller features. Optimization experiments in the continuous space belong to the many-armed bandits class of problems. Although previous research provides algorithms for solving this class of problems, these algorithms have not been implemented in real-world online experimentation problems and do not consider application constraints such as the time to compute a solution, the selection of a best arm, and the estimation of the mean-reward function. This work discusses online experiments in the context of the many-armed bandits class of problems and provides three main contributions: (1) an algorithm modification to include online experiment constraints, (2) an implementation of this algorithm in an industrial setting in collaboration with Sony Mobile, and (3) statistical evidence that supports the modification of the algorithm for online experiment scenarios. These contributions support the relevance of the LG-HOO algorithm in the context of optimization experiments and show how the algorithm can be used to support continuous optimization of online systems in stochastic scenarios.
