Learning from Examples and Membership Queries with Structured Determinations

It is well known that prior knowledge or bias can speed up learning, at least in theory. It has proved difficult, however, to make constructive use of prior knowledge so that approximately correct hypotheses can be learned efficiently. In this paper, we consider a particular form of bias consisting of a set of “determinations.” A set of attributes is said to determine a given attribute if the latter is purely a function of the former. The bias is tree-structured if there is a tree of attributes such that the attribute at each node is determined by its children, where the leaves correspond to input attributes and the root corresponds to the target attribute of the learning problem. The set of allowed functions at each node is called the basis. A tree-structured bias restricts the target functions to those representable by a read-once formula (a Boolean formula in which each variable occurs at most once) of a given structure over the basis functions. We show that, given a tree-structured bias, efficient learning from random examples and membership queries is possible, provided that the basis class is itself learnable and obeys some mild closure conditions. The algorithm uses a form of controlled experimentation to learn each part of the overall function, fixing the inputs to the other parts at appropriate values. We present empirical results showing that when a tree-structured bias is available, our method significantly improves upon knowledge-free induction. We also show that there are hard cryptographic limitations on generalizing these positive results to structured determinations in the form of a directed acyclic graph.
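
The “controlled experimentation” step can be made concrete with a small sketch. With the tree structure known in advance, membership queries that hold one subtree's inputs at constant values isolate the behavior of the rest of the tree. The Python sketch below is illustrative only, not the paper's polynomial-time algorithm: it brute-forces the node labels of a fixed four-leaf tree over an assumed basis {AND, OR, XOR}, and all names in it (hidden_target, BASIS, learn_by_queries) are invented for this example.

```python
from itertools import product

# Assumed basis class for the sketch: two-input Boolean gates.
BASIS = {
    "AND": lambda a, b: int(a and b),
    "OR":  lambda a, b: int(a or b),
    "XOR": lambda a, b: a ^ b,
}

def hidden_target(x):
    """The unknown read-once target: XOR(AND(x0, x1), OR(x2, x3)).
    The learner knows the tree shape (two pairs of leaves feeding the
    root) but not which basis function sits at each internal node."""
    return BASIS["XOR"](BASIS["AND"](x[0], x[1]), BASIS["OR"](x[2], x[3]))

def evaluate(labels, x):
    """Evaluate the known tree shape under a candidate labeling
    (root, left, right) of its three internal nodes."""
    root, left, right = labels
    return BASIS[root](BASIS[left](x[0], x[1]), BASIS[right](x[2], x[3]))

def learn_by_queries(oracle):
    """Keep every labeling of the three nodes that agrees with the oracle
    on all membership queries. Queries that pin one subtree's inputs at
    constants isolate the other subtree: with x2 = x3 = 0, the right node
    outputs 0 under every basis function, so the remaining answers expose
    the left node. That per-part isolation is the 'controlled
    experimentation' of the abstract (done here by brute force)."""
    candidates = list(product(BASIS, repeat=3))
    for x in product([0, 1], repeat=4):          # 16 membership queries
        y = oracle(list(x))
        candidates = [c for c in candidates if evaluate(c, list(x)) == y]
    return candidates

if __name__ == "__main__":
    print(learn_by_queries(hidden_target))       # -> [('XOR', 'AND', 'OR')]
```

Note, for instance, that fixing x2 = x3 = 0 forces the right node to a constant, so those queries determine the left node uniquely; learning each node in isolation this way, rather than filtering all labelings jointly, is what makes a tree-structured bias tractable.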
