Characterizations of Learnability for Classes of {0, ..., n}-Valued Functions

We investigate the PAC learnability of classes of {0, ..., n}-valued functions. For n = 1, it is known that finiteness of the Vapnik-Chervonenkis (VC) dimension is necessary and sufficient for learning; for n > 1, several generalizations of the VC-dimension, each yielding a distinct characterization of learnability, have been proposed by a number of researchers. In this paper we present a general scheme for extending the VC-dimension to the case n > 1. Our scheme defines a wide variety of notions of dimension in which all these variants of the VC-dimension, previously introduced in the context of learning, appear as special cases. Our main result is a simple condition characterizing the set of notions of dimension whose finiteness is necessary and sufficient for learning. This provides a variety of new tools for determining the learnability of a class of multi-valued functions. Our characterization is also shown to hold in the "robust" variant of the PAC model and for any "reasonable" loss function.
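As an illustrative sketch (not taken from the paper), one well-known generalization of the VC-dimension to {0, ..., n}-valued classes is the Natarajan dimension: a set S is shattered if there are two labelings f0, f1 disagreeing on every point of S such that every way of choosing between f0 and f1 pointwise is realized by some function in the class. For a finite class and domain it can be computed by brute force; the function and variable names below are our own.

```python
from itertools import combinations, product

def natarajan_shattered(points, functions):
    """Check whether `points` is Natarajan-shattered by the finite class
    `functions` (each function represented as a dict: point -> value)."""
    # All label patterns on `points` realized by the class.
    realized = {tuple(f[x] for x in points) for f in functions}
    values = sorted({v for pat in realized for v in pat})
    # Search for witness labelings f0, f1 disagreeing on every point.
    for f0 in product(values, repeat=len(points)):
        for f1 in product(values, repeat=len(points)):
            if any(a == b for a, b in zip(f0, f1)):
                continue
            # Every pointwise choice between f0 and f1 must be realized.
            if all(pat in realized for pat in product(*zip(f0, f1))):
                return True
    return False

def natarajan_dimension(domain, functions):
    """Largest cardinality of a Natarajan-shattered subset of `domain`."""
    for d in range(len(domain), 0, -1):
        if any(natarajan_shattered(S, functions)
               for S in combinations(domain, d)):
            return d
    return 0
```

For n = 1 (binary-valued classes) this definition collapses to ordinary VC shattering; the paper's scheme treats this and the other proposed dimensions as special cases of one family.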
