Feature construction: an analytic framework and an application to decision trees
While similarity-based learning (SBL) methods can be effective for acquiring concept descriptions from labeled examples, their success depends largely on the quality of the features used to describe those examples. When a learning problem is posed with low-level features, the complexity of the concept-membership function can make SBL inaccurate, expensive, or simply impossible. One way to overcome this limitation is feature construction: building new features by applying constructive operators to existing features. Feature construction can yield an improved instance space in which the concept-membership function is better behaved with respect to the inductive biases of SBL algorithms. Feature construction, however, is computationally difficult, primarily because of the intractably large space of potential new features. To assist in the study and advancement of feature construction methods, this thesis presents a framework organized around four aspects: (1) need detection, (2) constructor selection, (3) constructor generalization, and (4) feature evaluation. This framework was used to analyze eight existing systems (BACON, BOGART, DUCE, FRINGE, MIRO, PLSO, STAGGER, and STABB) and to identify promising approaches to feature construction. The framework also served as the basis for the design of CITRE, an inductive system that constructs new features using decision trees. CITRE was tested on five learning problems: l-term kDNF Boolean functions, tic-tac-toe classification, mushroom classification, voting-record classification, and chess-end-game classification. The results demonstrate CITRE's potential for significantly improving hypothesis accuracy and conciseness. They also reveal substantial benefits obtainable from simple domain-knowledge constraints and from constructor generalization during feature construction.
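The core idea of a constructive operator can be illustrated with a minimal sketch. Here a conjunction operator combines two existing Boolean features into a new one, in the spirit of deriving features from the tests along a decision-tree branch; the function names, the `conjoin` helper, and the tic-tac-toe-style example encoding are all illustrative assumptions, not CITRE's actual interface.

```python
# Hedged sketch: a constructive operator that builds a new Boolean
# feature as the conjunction of two existing features. This mirrors
# (loosely) how tests along one decision-tree path can be combined
# into a single higher-level feature. Names are hypothetical.

def conjoin(f, g):
    """Constructive operator: returns the new feature (f AND g)."""
    return lambda example: f(example) and g(example)

# Low-level features over examples encoded as dicts of board cells.
top_left_is_x = lambda e: e["top_left"] == "x"
center_is_x = lambda e: e["center"] == "x"

# Constructed feature: both cells are held by x.
new_feature = conjoin(top_left_is_x, center_is_x)

example = {"top_left": "x", "center": "o"}
print(new_feature(example))  # False: center is not "x"
```

A feature-construction system would then evaluate such candidate features (aspect 4 of the framework) and retain only those that simplify the induced hypothesis.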