Preferential Bayesian optimisation with skew Gaussian processes

Bayesian optimisation (BO) is an effective approach to sequential black-box optimisation when direct queries of the objective function are expensive. In some settings, however, the objective can only be accessed through preference judgments such as "this is better than that" between two candidate solutions (as in A/B tests or recommender systems). The state-of-the-art approach to Preferential Bayesian Optimisation (PBO) uses a Gaussian process to model the preference function and a Bernoulli likelihood to model the observed pairwise comparisons. Laplace's method is then employed to compute posterior inferences and, in particular, to build an appropriate acquisition function. In this paper, we prove that the true posterior distribution of the preference function is a Skew Gaussian Process (SkewGP) with highly skewed pairwise marginals, and thus show that Laplace's method usually provides a very poor approximation. We then derive an efficient method to compute the exact SkewGP posterior and use it as a surrogate model for PBO with standard acquisition functions (Upper Credible Bound, etc.). We illustrate the benefits of our exact PBO-SkewGP in a variety of experiments, showing that it consistently outperforms PBO based on Laplace's approximation in both convergence speed and computational time. We also show that our framework extends to mixed preferential-categorical BO, typical, for instance, in smart manufacturing, where binary judgments (valid or non-valid) are available alongside preference judgments.
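To make the surrogate model concrete, the sketch below illustrates the preference likelihood described in the abstract: a latent function drawn from a GP prior, with each pairwise duel "x_i preferred to x_j" contributing a probit (Bernoulli) factor on the difference f(x_i) − f(x_j). The kernel hyperparameters, noise scale, and duel set are hypothetical placeholders, not values from the paper; the point is that the joint likelihood is a Gaussian CDF of a linear map of f, which is why conjugacy with the GP prior yields a skew Gaussian (SkewGP) posterior rather than a Gaussian one.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Candidate inputs and an RBF kernel (hypothetical hyperparameters).
X = np.linspace(0.0, 1.0, 6)[:, None]

def rbf(A, B, ell=0.2, var=1.0):
    d = A - B.T
    return var * np.exp(-0.5 * (d / ell) ** 2)

K = rbf(X, X) + 1e-8 * np.eye(len(X))

# Latent preference function drawn from the GP prior.
f = rng.multivariate_normal(np.zeros(len(X)), K)

# Observed duels (i, j): "x_i is preferred to x_j".
pairs = [(0, 1), (2, 3), (4, 5), (1, 2)]

# Each duel contributes a probit likelihood factor
#   P(x_i > x_j | f) = Phi((f_i - f_j) / (sqrt(2) * sigma)),
# so the joint likelihood is Phi applied to W @ f for a
# sparse difference matrix W.
sigma = 0.1
W = np.zeros((len(pairs), len(X)))
for r, (i, j) in enumerate(pairs):
    W[r, i], W[r, j] = 1.0, -1.0

log_lik = norm.logcdf(W @ f / (np.sqrt(2) * sigma)).sum()
```

Because the likelihood depends on f only through Φ(Wf), the posterior belongs to the unified skew-normal family; a Laplace approximation forces it back to a symmetric Gaussian, which is the mismatch the paper quantifies.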
