Paper information - Policies for the selection of bias in inductive machine learning

Policies for the selection of bias in inductive machine learning

Recently, machine learning research has concentrated more and more on inductive bias: the choices made in designing and setting up a learning program that lead it to choose one generalization of the data over another. Unfortunately, there cannot be a single bias that is satisfactory for all learning problems. For machine learning to be useful to the non-researcher (for example, as a tool for knowledge acquisition), novel pragmatic considerations should not necessitate substantial system building. In this dissertation I clarify the concept of inductive bias by: (i) extending the generally accepted model, and (ii) separating out inductive policy. I extend the model to include the example space and the method of searching the example space, in addition to the rule space and the method of searching it, and I make explicit the relations between the spaces. Inductive policy comprises the pragmatic considerations affecting the design and implementation of a learning system, which guide the selection of the syntactic and semantic components of inductive bias. Concentrating on inductive policy allows me to identify and characterize several general techniques for selecting inductive bias, which are useful in the analysis of existing bias selection systems and whose specific instantiations can be used as a toolkit for building new bias selection systems. A theoretical analysis of one family of inductive policies shows that under certain conditions iterative weakening is an optimal or near-optimal policy for bias selection. This contribution extends beyond machine learning; in particular, I design a near-optimal policy for selecting both the depth and the breadth of a depth-first search. Finally, I demonstrate that different inductive policies can be easily and successfully implemented in a single system, addressing different (and novel) pragmatic constraints.
In the SBS Testbed, I successfully implement policies for: learning with high classification accuracy, learning subject to time constraints, learning subject to space constraints, and learning with sensitivity to the cost of errors. Modeling bias selection as a search problem is key to facilitating the implementation of these policies; changing policy entails changing only the set of operators and/or the evaluation function.
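
The idea of treating bias selection as a search, with a policy defined by its operators and its evaluation function, can be illustrated with a minimal sketch. Nothing here is taken from the SBS Testbed itself: the function `select_bias`, the integer encoding of a bias (1 is the strongest, 5 the weakest), and the accuracy and cost figures are all hypothetical, chosen only to make the example runnable. The loop tries progressively weaker biases in the spirit of iterative weakening and stops once the evaluation reaches a threshold.

```python
from typing import Callable, Optional, Sequence, Tuple

Bias = int  # hypothetical encoding: larger number = weaker bias
Operator = Callable[[Bias], Optional[Bias]]
Evaluator = Callable[[Bias], float]


def select_bias(start: Bias, operators: Sequence[Operator],
                evaluate: Evaluator, threshold: float,
                max_steps: int = 50) -> Tuple[Bias, float]:
    """Search the space of biases, keeping the best one seen so far."""
    current = start
    best, best_score = start, evaluate(start)
    for _ in range(max_steps):
        if best_score >= threshold:
            break  # satisfactory bias found; stop weakening
        successors = [op(current) for op in operators]
        successors = [s for s in successors if s is not None]
        if not successors:
            break  # no weaker bias left to try
        current = max(successors, key=evaluate)
        score = evaluate(current)
        if score > best_score:
            best, best_score = current, score
    return best, best_score


# Made-up numbers for illustration only (not results from the dissertation).
TOY_ACCURACY = {1: 0.60, 2: 0.75, 3: 0.90, 4: 0.92, 5: 0.92}


def weaken(bias: Bias) -> Optional[Bias]:
    """Single operator: move to the next weaker bias, if any."""
    return bias + 1 if bias < 5 else None


def accuracy_policy(bias: Bias) -> float:
    """Evaluation function for a policy that only cares about accuracy."""
    return TOY_ACCURACY[bias]


def time_limited_policy(bias: Bias, budget: int = 5) -> float:
    """Same operators, different evaluation: reject biases over budget."""
    cost = bias * bias  # hypothetical running-time model
    return TOY_ACCURACY[bias] if cost <= budget else float("-inf")


if __name__ == "__main__":
    print(select_bias(1, [weaken], accuracy_policy, threshold=0.90))
    print(select_bias(1, [weaken], time_limited_policy, threshold=0.90))
```

In this sketch, switching from the accuracy policy to the time-limited policy leaves `select_bias` and the operator set untouched; only the evaluation function passed in changes, which mirrors the claim that a change of policy amounts to a change of operators and/or evaluation function.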

John Foster Provost | J. Provost