A Formal Model of Hierarchical Concept Learning

We show how to learn from examples (in Valiant's style) any concept representable as a boolean function or circuit, with the help of a teacher who breaks the concept into subconcepts and teaches one subconcept per lesson. Each subconcept corresponds to a gate in the boolean circuit. The learner learns each subconcept from examples that have been drawn at random according to an arbitrary probability distribution and labeled as positive or negative instances of the subconcept by the teacher. The learning procedure runs in time polynomial in the size of the circuit. The learner outputs not the unknown boolean circuit, but rather a program that, for any input, either produces the same answer as the unknown boolean circuit or says "I don't know." The output of this learning procedure is therefore reliable. Furthermore, with high probability the output program is nearly always useful, in that it says "I don't know" very rarely. A key technique is to maintain a hierarchy of explicit "version spaces." Our main contribution is thus a learning procedure whose output is reliable and nearly always useful; this had not previously been accomplished within Valiant's model of learnability.
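The core of the reliability guarantee can be illustrated with a minimal version-space sketch (this is an illustration of the general idea, not the paper's algorithm, which maintains a hierarchy of version spaces, one per gate). Here the hypothesis class is assumed to be monotone conjunctions over a few boolean variables: the learner keeps every hypothesis consistent with the labeled examples, answers only when all surviving hypotheses agree, and says "I don't know" otherwise.

```python
from itertools import combinations

def all_conjunctions(n):
    """Every monotone conjunction over variables 0..n-1, as an index set."""
    for r in range(n + 1):
        for vars_ in combinations(range(n), r):
            yield frozenset(vars_)

def evaluate(hyp, x):
    """A conjunction is true iff every variable it mentions is set in x."""
    return all(bool(x[i]) for i in hyp)

def version_space(n, examples):
    """Keep only the hypotheses consistent with all (x, label) pairs."""
    return [h for h in all_conjunctions(n)
            if all(evaluate(h, x) == y for x, y in examples)]

def predict(vs, x):
    """Answer only when the entire version space agrees: reliable output."""
    answers = {evaluate(h, x) for h in vs}
    return answers.pop() if len(answers) == 1 else "I don't know"

# Target concept (hidden from the learner): x0 AND x2, over 3 variables.
examples = [((1, 0, 1), True), ((0, 0, 0), False)]
vs = version_space(3, examples)
print(predict(vs, (1, 0, 1)))   # True: all consistent hypotheses agree
print(predict(vs, (1, 0, 0)))   # "I don't know": the version space disagrees
print(predict(vs, (0, 1, 0)))   # False: all consistent hypotheses agree
```

Because every answer is unanimous over the consistent hypotheses, a prediction is never wrong on examples consistent with the training labels; as more examples arrive, the version space shrinks and "I don't know" becomes rare, which is the "nearly always useful" property in miniature.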
