Bias in Algorithm Portfolio Performance Evaluation

A Virtual Best Solver (VBS) is a hypothetical algorithm that selects the best solver from a given portfolio of alternatives on a per-instance basis. The VBS idealizes the performance obtained when all solvers in a portfolio are run in parallel, and also gives a valuable bound on the performance of portfolio-based algorithm selectors. Typically, VBS performance is measured by running every solver in a portfolio once on a given instance and reporting the best performance over all solvers. Here, we argue that doing so results in a flawed measure that is biased toward reporting better performance when a randomized solver is present in the portfolio. Specifically, this flawed notion of VBS tends to report performance better than that achievable by a perfect selector that, for each instance, runs the solver with the best expected running time. We report results from an empirical study using solvers and instances submitted to several SAT competitions, in which we observe significant bias on many random instances and some combinatorial instances. We also show that the bias increases with the number of randomized solvers and decreases as solver performance is averaged over many independent runs per instance. We propose an alternative VBS performance measure obtained by (1) empirically determining, for each instance, the solver with the best expected performance and (2) taking bootstrap samples of this solver's runs on every instance to obtain a confidence interval on VBS performance. Our findings shed new light on widely studied algorithm selection benchmarks and help explain performance gaps observed between the VBS and state-of-the-art algorithm selection approaches.
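The direction of the bias has a simple explanation, sketched here as intuition rather than quoted from the paper. Let T_1, ..., T_k denote the (random) single-run times of the k portfolio solvers on a fixed instance. Since min_i T_i <= T_j holds pointwise for every solver j, taking expectations gives

```latex
% Expected single-run VBS time vs. the perfect-selector bound
\mathbb{E}\Big[\min_{1 \le i \le k} T_i\Big]
  \;\le\;
  \min_{1 \le i \le k} \mathbb{E}\big[T_i\big]
```

The left-hand side is what the single-run measure estimates; the right-hand side is the performance of a perfect expected-runtime selector. The inequality is strict whenever, with positive probability, some solver beats the one with the best expected running time, which is consistent with the finding that the bias appears with randomized solvers and shrinks as runs are averaged.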
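As a concrete illustration of the two measures, the sketch below contrasts the biased single-run VBS with the proposed bootstrap-based alternative. It is a minimal sketch, not the paper's implementation: the data layout runtimes[instance][solver] -> array of measured runtimes, and the names naive_vbs and corrected_vbs_ci, are assumptions made for illustration, and mean runtime stands in for whatever score (e.g., PAR) a competition would use.

```python
import numpy as np

rng = np.random.default_rng(0)

def naive_vbs(runtimes):
    """Single-run VBS: draw one run per solver, take the minimum.

    `runtimes` is a hypothetical dict mapping instance -> solver ->
    array of runtimes over independent runs (layout assumed here).
    """
    total = 0.0
    for solver_runs in runtimes.values():
        # One sampled run per solver; the min over these samples is
        # the optimistically biased measure discussed in the abstract.
        samples = [rng.choice(runs) for runs in solver_runs.values()]
        total += min(samples)
    return total / len(runtimes)

def corrected_vbs_ci(runtimes, n_boot=10_000, alpha=0.05):
    """Proposed measure: (1) per instance, pick the solver with the
    best expected (mean) runtime; (2) bootstrap that solver's runs
    to get a confidence interval on VBS performance."""
    boot_means = np.zeros(n_boot)
    for solver_runs in runtimes.values():
        best = np.asarray(min(solver_runs.values(), key=np.mean))  # step (1)
        # Step (2): resample this solver's runs with replacement.
        idx = rng.integers(0, len(best), size=(n_boot, len(best)))
        boot_means += best[idx].mean(axis=1)
    boot_means /= len(runtimes)
    lo, hi = np.quantile(boot_means, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Hypothetical toy data: two instances, one deterministic and one
# randomized solver; the numbers are illustrative only.
runtimes = {
    "inst1": {"det": [10.0] * 20, "rand": list(rng.exponential(12.0, 20))},
    "inst2": {"det": [5.0] * 20,  "rand": list(rng.exponential(4.0, 20))},
}
print(naive_vbs(runtimes))        # biased single-run estimate
print(corrected_vbs_ci(runtimes)) # bootstrap CI on the corrected measure
```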
