Playing monotone games to understand learning behaviors

We deal with a special class of games against nature that corresponds to subsymbolic learning problems in which we know a local descent direction in the error landscape but not the amount gained at each step of the learning procedure. Namely, Alice and Bob play a game in which the probability of victory grows monotonically, by unknown amounts, with the resources each player employs. With Alice's effort held fixed, Bob increases his resources on the basis of the outcomes of the individual contests (victory, tie, or defeat). Quite unlike the usual aims in game theory, his goal is to stop as soon as the defeat probability falls below a given threshold with high confidence. We adopt this game policy as an archetypal remedy to the overtraining threat that learning algorithms generally face: we recast the original game in a computational learning framework analogous to the Probably Approximately Correct formulation. There, a careful use of a special inferential mechanism (known as the twisting argument) highlights statistics relevant to managing different trade-offs between observability and controllability of the defeat probability. With similar statistics we discuss an analogous trade-off underlying the stopping criterion of subsymbolic learning procedures. In conclusion, we propose a principled stopping rule based solely on the behavior of the training session, hence without diverting examples into a test set.
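Bob's stopping policy lends itself to a short simulation. The sketch below is illustrative only: it uses a standard one-sided Hoeffding bound as a stand-in for the statistics the paper derives via the twisting argument, and the payoff function `defeat_prob`, the resource-doubling schedule, and all parameter values are our own assumptions rather than anything taken from the paper.

```python
import math
import random

def defeat_prob(alice_effort, bob_resources):
    # Hypothetical monotone landscape (not from the paper): Bob's defeat
    # probability shrinks, by amounts unknown to him, as his resources grow.
    return alice_effort / (alice_effort + bob_resources)

def play_until_confident(alice_effort=10.0, eps=0.1, delta=0.05,
                         batch=500, max_rounds=50, seed=0):
    """Bob's stopping policy, sketched: after a batch of contests at the
    current resource level, form a one-sided upper confidence bound on the
    defeat probability and stop once that bound falls below eps."""
    rng = random.Random(seed)
    resources = 1.0
    for round_no in range(1, max_rounds + 1):
        p = defeat_prob(alice_effort, resources)       # unknown to Bob
        defeats = sum(rng.random() < p for _ in range(batch))
        # Hoeffding: with probability >= 1 - delta, true p <= p_hat + t,
        # where t = sqrt(ln(1/delta) / (2 * batch)).
        upper = defeats / batch + math.sqrt(math.log(1 / delta) / (2 * batch))
        if upper < eps:
            # Defeat probability is below eps with confidence 1 - delta.
            return round_no, resources, upper
        resources *= 2  # raise resources; the gain per step stays unknown
    return None  # confidence never reached within the round budget

print(play_until_confident())
```

Tightening `eps` or `delta` in this sketch forces larger batches or more rounds before the bound certifies a stop, which mirrors the observability/controllability trade-off the abstract attributes to the defeat probability.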
