Mining for Features to Improve Classification

When first faced with a learning task, it is often unclear what a satisfactory representation of the training data should be, so we are forced to create a set of features that appear plausible, with no strong confidence that they will yield superior learning. Moreover, we often have no prior knowledge of which learning method is best to apply, and so try multiple methods in an attempt to find the one that performs best. This paper describes a method called Feature-mine that takes a set of features and augments them with macro-features that test for the occurrence of combinations of values of the original features. Our approach uses associations mined from the data to create these new features. Importantly, the method is independent of any learning method, with the creation and selection of new features based simply on the relationship between the attributes and the class labels in