Hydra-mm: Learning Multiple Descriptions to Improve Classification Accuracy

For learning tasks with few examples, greater classification accuracy can be achieved by learning several concept descriptions for each class in the data and producing a classification that combines evidence from the multiple descriptions. Stochastic (randomized) search can be used to generate many concept descriptions for each class. Here we use a tractable approximation to the optimal Bayesian method for combining evidence from multiple descriptions. Learning multiple descriptions is especially useful when additional training data is difficult to obtain. The primary result of this paper is that multiple concept descriptions are particularly helpful for improving accuracy in hypothesis spaces in which there are many equally good rules to learn. A second result is experimental evidence that, for concepts containing many disjuncts, learning multiple rule sets yields more accurate classifications than learning multiple rules.
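To make the combination step concrete, the sketch below shows one simple way to weight the predictions of several learned rule sets and pick the highest-scoring class. The posterior-style weights and the function names (combine_evidence, rule_sets, posteriors) are illustrative assumptions, not the paper's actual tractable Bayesian approximation, whose exact form is not given in this abstract.

```python
from collections import defaultdict

def combine_evidence(rule_sets, posteriors, example, classes):
    """Return the class with the highest weight-combined vote.

    rule_sets  -- list of callables; each maps an example to a predicted class
    posteriors -- one weight per rule set (assumed: an estimate of the rule
                  set's posterior probability given the training data)
    example    -- the instance to classify
    classes    -- iterable of possible class labels
    """
    votes = defaultdict(float)
    for predict, weight in zip(rule_sets, posteriors):
        # Each description contributes evidence proportional to its weight.
        votes[predict(example)] += weight
    return max(classes, key=lambda c: votes[c])


if __name__ == "__main__":
    # Toy usage: three stochastic-search runs produced three rule sets,
    # represented here as simple threshold rules on a single feature.
    rule_sets = [
        lambda x: "pos" if x[0] > 0.5 else "neg",
        lambda x: "pos" if x[0] > 0.4 else "neg",
        lambda x: "pos" if x[0] > 0.7 else "neg",
    ]
    posteriors = [0.5, 0.3, 0.2]
    print(combine_evidence(rule_sets, posteriors, (0.6,), ["pos", "neg"]))
```

In this toy run the first two rule sets vote "pos" (combined weight 0.8) and the third votes "neg" (weight 0.2), so the combined classification is "pos".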