Ubiquity symposium: Evolutionary computation and the processes of life: what the no free lunch theorems really mean: how to improve search algorithms

The first No Free Lunch (NFL) theorems were introduced in [9], in the context of supervised machine learning. These theorems were then popularized in [8], based on a preprint version of [9]. Loosely speaking, these original theorems can be viewed as a formalization and elaboration of concerns about the legitimacy of inductive inference, concerns that date back to David Hume (if not earlier). Shortly after these original theorems were published, additional NFL theorems that apply to search were introduced in [12].

The NFL theorems have stimulated a great deal of subsequent work, with over 2,500 citations of [12] alone by spring 2012 according to Google Scholar. However, arguably much of that research has missed the most important implications of the theorems. As stated in [12], the primary importance of the NFL theorems for search is what they tell us about “the underlying mathematical ‘skeleton’ of optimization theory before the ‘flesh’ of the probability distributions of a particular context and set of optimization problems are imposed”. So in particular, while the NFL theorems have strong implications if one believes in a uniform distribution over optimization problems, in no sense should they be interpreted as advocating such a distribution.
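To make the uniform-distribution claim concrete: the search version of the NFL theorems in [12] says, roughly, that for any two deterministic, non-retracing search algorithms, the histogram of objective values observed after m distinct evaluations is identical when averaged uniformly over all objective functions f: X → Y on finite X and Y. The short Python sketch below (an illustration, not code from the paper) checks this exhaustively on a toy space: a blind enumerator and an adaptive rule produce exactly the same histogram of value traces across all eight functions from a three-point domain to {0, 1}.

# Minimal empirical check of the NFL theorem for search; an illustration,
# not code from the paper. Averaged over ALL objective functions f: X -> Y
# on a tiny space, any two deterministic, non-retracing algorithms yield
# the same histogram of observed-value sequences.
from collections import Counter
from itertools import product

X = range(3)   # search space: three candidate points
Y = (0, 1)     # possible objective values
M = 2          # number of distinct points each algorithm may evaluate

def run(alg, f, m):
    """Let alg choose m distinct points; return the sequence of f-values seen."""
    visited, values = [], []
    for _ in range(m):
        x = alg(visited, values)
        visited.append(x)
        values.append(f[x])
    return tuple(values)

def blind(visited, values):
    # non-adaptive: always evaluate the lowest-indexed unvisited point
    return min(x for x in X if x not in visited)

def greedy(visited, values):
    # adaptive: after seeing a 1, jump to the highest-indexed unvisited point
    candidates = [x for x in X if x not in visited]
    return max(candidates) if values and values[-1] == 1 else min(candidates)

for alg in (blind, greedy):
    traces = Counter(run(alg, dict(enumerate(f)), M)
                     for f in product(Y, repeat=len(X)))
    print(alg.__name__, sorted(traces.items()))
# Both algorithms print the same histogram: averaged uniformly over all f,
# neither outperforms the other.

Any advantage the adaptive rule gains on some functions is exactly cancelled on others. That exhaustive averaging is the “mathematical skeleton” referred to above; the practical question is always which non-uniform distribution over problems one actually faces.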

[1] David H. Wolpert, et al. On Bias Plus Variance, 1997, Neural Computation.

[2] David H. Wolpert, et al. The Existence of A Priori Distinctions Between Learning Algorithms, 1996, Neural Computation.

[3] David H. Wolpert, et al. Bias-Variance Techniques for Monte Carlo Optimization: Cross-validation for the CE Method, 2008, arXiv.

[4] David H. Wolpert, et al. The Lack of A Priori Distinctions Between Learning Algorithms, 1996, Neural Computation.

[5] Cullen Schaffer, et al. A Conservation Law for Generalization Performance, 1994, ICML.

[6] Lih-Yuan Deng, et al. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning, 2006, Technometrics.

[7] David H. Wolpert, et al. No free lunch theorems for optimization, 1997, IEEE Trans. Evol. Comput.

[8] David H. Wolpert, et al. On the Connection between In-sample Testing and Generalization Error, 1992, Complex Syst.

[9] David H. Wolpert, et al. Coevolutionary free lunches, 2005, IEEE Transactions on Evolutionary Computation.

[10] Yuri Ermoliev, et al. Monte Carlo Optimization and Path Dependent Nonstationary Laws of Large Numbers, 1998.

[11] David H. Wolpert, et al. What makes an optimization problem hard?, 1995, Complexity.

[12] Paul A. Viola, et al. MIMIC: Finding Optima by Estimating Probability Densities, 1996, NIPS.

[13] Dirk P. Kroese, et al. Cross-Entropy Method, 2011.

[14] David H. Wolpert, et al. Probability Collectives in Optimization, 2013.

[15] Dirk P. Kroese, et al. The Cross Entropy Method: A Unified Approach To Combinatorial Optimization, Monte-Carlo Simulation (Information Science and Statistics), 2004.