On Learning in the Presence of Unspeci ed Attribute Values

We continue the study of learning in the presence of unspeciied attribute values (UAV) where some of the attributes of the examples may be unspeciied 9, 4]. A UAV assignment x 2 f0; 1; ?g n , where ? indicates unspeciied, is classiied positive (negative) with respect to a Boolean function f if all possible assignments for the unspeciied attributes result in a positive (negative) classiication. Otherwise, the classiication of x is ?. Given an example x 2 f0; 1; ?g n , the oracle UAV-MQ(x) responds with the classiication of x with respect to the unknown target. Given a hypothesis h, the oracle UAV-EQ returns an example x 2 f0; 1; ?g n for which h(x) is incorrect, if such an example exists. The new contributions of this paper are as follows. First we deene a new oracle called the relevant variable oracle, or RV, which takes as input a subcube of f0; 1g n and returns a relevant variable of the target on this subcube, if one exists. We then show that a class is query learnable using UAV-MQs if and only if it is query learnable using MQs and an RV. Next we give an almost tight lower bound for the number of UAV-MQs required to learn k-term DNF. After this we investigate the learnability of CDNF with UAV-MQs. The two main results of this particular investigation are 1) if class C is learnable as CDNF using MQs and EQs then it is learnable using UAV-MQs, and 2) CDNF is query learnable using only UAV-MQs (the algorithm is not time eecient). We then give eecient learning algorithms using UAV-MQs for the class of rank-k decision trees and the class of Boolean functions of a constant number of terms or clauses. The former of the two previous results leads to a quasi-polynomial time UAV-MQ algorithm for decision trees with polynomial size and CDNF with polynomial size. Finally, we answer an open problem posed in both 9] and 4] by showing that decision trees are learnable using UAV-MQs and UAV-EQs.