What is important about the No Free Lunch theorems?

The No Free Lunch theorems prove that under a uniform distribution over induction problems (search problems or learning problems), all induction algorithms perform equally. As I discuss in this chapter, the importance of the theorems arises from using them to analyze scenarios involving \emph{non-uniform} distributions, and to compare different algorithms without any assumption about the distribution over problems at all. In particular, the theorems prove that \emph{anti}-cross-validation (choosing among a set of candidate algorithms based on which has \emph{worst} out-of-sample behavior) performs as well as cross-validation, unless one makes an assumption -- which has never been formalized -- about how the distribution over induction problems, on the one hand, is related to the set of algorithms one chooses among using (anti-)cross-validation, on the other. In addition, the theorems establish strong caveats concerning the significance of the many results in the literature that establish the strength of a particular algorithm without assuming a particular distribution. They also motivate a ``dictionary'' between supervised learning and blackbox optimization, which allows one to ``translate'' techniques from supervised learning into the domain of blackbox optimization, thereby strengthening blackbox optimization algorithms. In addition to these topics, I briefly discuss the theorems' implications for the philosophy of science.
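The anti-cross-validation claim is easy to check numerically. Below is a minimal simulation sketch, not taken from the chapter: it draws Boolean target functions uniformly at random over a small finite input space, lets cross-validation and anti-cross-validation each choose between two toy learners (one predicting the majority training label off-sample, the other the minority label), and records the average off-training-set error of each choice. All function and parameter names (predict_majority, N_TRIALS, etc.) are illustrative assumptions, not from the cited works.

```python
# Minimal sketch: under a uniform distribution over Boolean target functions,
# choosing a learner by cross-validation or by anti-cross-validation gives
# the same average off-training-set error (about 0.5).
import random

N_X = 12          # size of the finite input space
N_TRAIN = 8       # number of distinct training inputs
N_TRIALS = 20000  # Monte Carlo repetitions

def predict_majority(train_labels):
    """Learner A: predict the most common training label on all off-sample points."""
    return int(sum(train_labels) * 2 >= len(train_labels))

def predict_minority(train_labels):
    """Learner B: predict the least common training label on all off-sample points."""
    return 1 - predict_majority(train_labels)

def loo_error(learner, labels):
    """Leave-one-out cross-validation error of a learner on the training labels."""
    errs = 0
    for i in range(len(labels)):
        held_out = labels[i]
        rest = labels[:i] + labels[i + 1:]
        errs += int(learner(rest) != held_out)
    return errs / len(labels)

def run(choose_worst):
    """Average off-training-set error when selecting a learner by cross-validation
    (choose_worst=False) or by anti-cross-validation (choose_worst=True)."""
    total = 0.0
    for _ in range(N_TRIALS):
        # Uniform distribution over Boolean target functions on N_X points:
        f = [random.randint(0, 1) for _ in range(N_X)]
        train, test = f[:N_TRAIN], f[N_TRAIN:]
        learners = [predict_majority, predict_minority]
        scores = [loo_error(m, train) for m in learners]
        pick = scores.index(max(scores) if choose_worst else min(scores))
        pred = learners[pick](train)
        total += sum(int(pred != y) for y in test) / len(test)
    return total / N_TRIALS

if __name__ == "__main__":
    print("cross-validation      OTS error:", round(run(False), 3))  # ~0.5
    print("anti-cross-validation OTS error:", round(run(True), 3))   # ~0.5
```

The sketch only illustrates the mechanism behind the theorem: because the off-training-set labels are independent of the training data under a uniform prior over target functions, no rule for selecting among the candidate learners can do better or worse than any other on average.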
