Learning and Revising Theories in Noisy Domains

This paper describes an approach to learning from noisy examples with an approximate theory. The approach combines a theory preference criterion with an overfitting avoidance strategy. The theory preference criterion is a coding scheme that extends the minimum description length (MDL) principle by unifying model complexity and exception cost: model complexity is the cost of encoding the operations an algorithm performs to obtain a logic program, while exception cost is the encoding length of the training examples misclassified by a theory. When the system learns from the remainder of the training set, it applies an overfitting avoidance technique and thus induces more accurate clauses. On these accounts, our approach proves more accurate and efficient than existing approaches.
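The preference criterion above can be sketched in code: a theory's score is the sum of its model complexity and its exception cost, and the theory with the smaller total description length is preferred. This is a minimal illustration, not the paper's actual coding scheme; the encoding-cost functions and all names below are hypothetical assumptions.

```python
import math

def model_complexity(num_clauses, num_literals, num_symbols):
    """Illustrative encoding cost (in bits) of a theory's clauses.

    Assumption: each literal picks one of `num_symbols` predicate symbols
    (log2(num_symbols) bits) and each clause adds one bit of delimiting
    overhead.
    """
    return num_clauses * 1.0 + num_literals * math.log2(num_symbols)

def exception_cost(num_misclassified, num_examples):
    """Illustrative encoding cost (in bits) of the misclassified examples:
    the bits needed to identify which k of the n training examples are
    exceptions, i.e. log2 C(n, k), computed via log-gamma."""
    if num_misclassified == 0:
        return 0.0
    return (math.lgamma(num_examples + 1)
            - math.lgamma(num_misclassified + 1)
            - math.lgamma(num_examples - num_misclassified + 1)) / math.log(2)

def description_length(theory, num_examples, num_symbols):
    """Total MDL score: model complexity plus exception cost."""
    return (model_complexity(theory["clauses"], theory["literals"], num_symbols)
            + exception_cost(theory["errors"], num_examples))

# Prefer the theory with the smaller total description length: a compact
# theory with a few exceptions can beat a larger theory that fits perfectly.
compact = {"clauses": 2, "literals": 5, "errors": 3}   # simple, some errors
verbose = {"clauses": 9, "literals": 30, "errors": 0}  # complex, error-free
best = min((compact, verbose), key=lambda t: description_length(t, 100, 16))
```

Under these toy encodings the compact theory wins: paying roughly 17 bits to flag 3 exceptions among 100 examples is cheaper than the extra clauses the error-free theory requires.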