On the necessity of Occam algorithms

The distribution-independent model of concept learning from examples ("PAC-learning") due to Valiant [15] is investigated. It has been shown that the existence of an Occam algorithm for a class of concepts is a sufficient condition for the PAC-learnability of that class [2, 3]. (An Occam algorithm is a randomized polynomial-time algorithm that, when given as input a sample of strings of some unknown concept to be learned, outputs a small description of a concept that is consistent with the sample.) In this paper it is shown that for all concept classes satisfying a natural closure property the converse is also true: the PAC-learnability of the class implies the existence of an Occam algorithm for the class. This results in a complete combinatorial characterization of the PAC-learnability of a wide variety of concept classes.

1 Introduction

The distribution-independent model of concept learning (PAC-learning, for "probably approximately correct learning") was introduced by Valiant [15] and has been widely used to investigate the phenomenon of learning from examples (e.g., many of the papers contained in [7] and [12]). In this model a concept is a subset of a domain of elements, and a concept class is a set of such concepts. A learning algorithm is presented with a collection of domain elements, with each element labeled

*Supported in part by NSF grant IRI-8809570.
**Supported in part by NSF grant IRI-8809570, and by the Department of Computer Science, University of Illinois at