On exact specification by examples

Some recent work [7, 14, 15] in computational learning theory has discussed learning in situations where the teacher is helpful, and can choose to present carefully chosen sequences of labelled examples to the learner. We say a function t in a set H of functions (a hypothesis space) defined on a set X is specified by S ⊆ X if the only function in H which agrees with t on S is t itself. The specification number σ(t) of t is the least cardinality of such an S. For a general hypothesis space, we show that the specification number of any hypothesis is at least equal to a parameter from [14] known as the testing dimension of H. We investigate in some detail the specification numbers of hypotheses in the set H_n of linearly separable Boolean functions: we present general methods for finding upper bounds on σ(t), and we characterise those t which have largest σ(t). We obtain a general lower bound on the number of examples required, and we show that for all nested hypotheses this lower bound is attained. We prove that for any t ∈ H_n there is exactly one set of examples of minimal cardinality (i.e., of cardinality σ(t)) which specifies t. We then discuss those t ∈ H_n which have limited dependence, in the sense that some of the variables are redundant (i.e., there are irrelevant attributes), giving tight upper and lower bounds on σ(t) for such hypotheses. In the final section of the paper, we address the complexity of computing specification numbers and related parameters.
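To make the definition of σ(t) concrete, here is a minimal brute-force sketch in Python. It is illustrative only, not from the paper: the helper names, the integer-weight enumeration of H_n, and the weight bound are our assumptions. It builds a small hypothesis space of linearly separable Boolean functions and then searches subsets of {0,1}^n in order of increasing size for a specifying set.

from itertools import combinations, product

def linearly_separable_functions(n, bound=3):
    # Hypothetical helper: enumerate H_n as truth-value tuples by sweeping
    # integer weights w_i and thresholds theta with |w_i|, |theta| <= bound.
    # The assumption that small integer bounds suffice holds for very small n.
    points = list(product((0, 1), repeat=n))
    rng = range(-bound, bound + 1)
    fns = set()
    for weights in product(rng, repeat=n):
        for theta in rng:
            fns.add(tuple(int(sum(w * x for w, x in zip(weights, p)) >= theta)
                          for p in points))
    return points, sorted(fns)

def specification_number(t, hypotheses, points):
    # sigma(t): size of a smallest S on which every hypothesis other than t
    # disagrees with t somewhere, found by exhaustive subset search.
    idx = {p: i for i, p in enumerate(points)}
    others = [h for h in hypotheses if h != t]
    for k in range(len(points) + 1):
        for S in combinations(points, k):
            if all(any(h[idx[p]] != t[idx[p]] for p in S) for h in others):
                return k, S

points, H2 = linearly_separable_functions(2)
t = tuple(int(x1 and x2) for (x1, x2) in points)  # AND of two variables, in H_2
sigma, witness = specification_number(t, H2, points)
print(sigma, witness)  # reports sigma(AND) = 3 for this H_2

The subset search is exponential in |X| = 2^n, so this only illustrates the definition; the paper's interest is precisely in bounds and exact values that avoid such enumeration.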

[1]  E. Sperner, Ein Satz über Untermengen einer endlichen Menge, Mathematische Zeitschrift 27, 1928.

[2]  V. N. Vapnik and A. Ya. Chervonenkis, On the uniform convergence of relative frequencies of events to their probabilities, Theory of Probability and its Applications 16, 1971.

[3]  S. Muroga, Threshold Logic and Its Applications, Wiley-Interscience, 1971.

[4]  N. Sauer, On the Density of Families of Sets, Journal of Combinatorial Theory, Series A 13, 1972.

[5]  D. S. Johnson, Approximation algorithms for combinatorial problems, in: Proceedings of the 5th Annual ACM Symposium on Theory of Computing (STOC), 1973.

[6]  M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, 1979.

[7]  L. G. Valiant, A theory of the learnable, in: Proceedings of the 16th Annual ACM Symposium on Theory of Computing (STOC), 1984.

[8]  B. Bollobás, Combinatorics: Set Systems, Hypergraphs, Families of Vectors and Combinatorial Probability, Cambridge University Press, 1986.

[9]  N. Littlestone, Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm, in: Proceedings of the 28th Annual Symposium on Foundations of Computer Science (FOCS), 1987.

[10]  D. Angluin, Queries and Concept Learning, Machine Learning 2, 1988.

[11]  R. Statman et al., Inductive inference: an abstract approach, in: Proceedings of the Annual Conference on Computational Learning Theory (COLT), 1988.

[12]  A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth, Learnability and the Vapnik–Chervonenkis dimension, Journal of the ACM 36, 1989.

[13]  S. A. Goldman, M. J. Kearns, and R. E. Schapire, Exact identification of circuits using fixed points of amplification functions, in: Proceedings of the 31st Annual Symposium on Foundations of Computer Science (FOCS), 1990.

[14]  S. A. Goldman and M. J. Kearns, On the complexity of teaching, in: Proceedings of the 4th Annual Workshop on Computational Learning Theory (COLT), 1991.

[15]  K. Romanik, Testing as a dual to learning, 1991.

[16]  M. Anthony and N. Biggs, Computational Learning Theory: An Introduction, Cambridge University Press, 1992.

[17]  K. Romanik and C. Smith, Testing Geometric Objects, Computational Geometry: Theory and Applications, 1994.