Paper information - Automated feature extraction for supervised learning

Feature extraction has traditionally been a manual process and something of an art. Methods derived from statistics and linear systems theory have been proposed, but by general consensus effective feature extraction remains a difficult problem. Recently, W. Tackett (1993) showed that genetic programming (GP) can be effective in automatically constructing features for identifying potential targets in digital images with high accuracy. From a basis set of simple arithmetic functions, he was able to construct numerical features that outperformed manually constructed features when used as inputs to several classifiers, including a binary-tree classifier and a multi-layer perceptron trained by back-propagation. Seeking a more generic feature-construction procedure, we developed a GP-based algorithm to extract features in a variety of domains and for most classification methods, including decision trees, feed-forward neural networks, and Bayesian classifiers. We have tested the technique with success by extracting features for three different types of problems: Boolean functions with binary features, a NASA telemetry problem with multiple classes and real-valued time-series inputs, and a wine variety classification problem with real-valued features from the UCI Machine Learning repository. We formally define the feature-construction method and show in some detail how it applies to specific classification problems.
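To make the idea of building features from a basis set of arithmetic functions concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. The fitness measure (best single-threshold accuracy on the constructed feature), the truncation selection scheme, the crossover-only variation, the node cap `MAX_NODES`, and all function names and parameter values are assumptions made for illustration.

```python
import operator
import random

MAX_NODES = 31  # assumed cap on tree size to keep offspring small


def protected_div(a, b):
    """Division that returns 1.0 when the denominator is (near) zero."""
    return a / b if abs(b) > 1e-9 else 1.0


# Assumed basis set of simple arithmetic functions.
BASIS = [operator.add, operator.sub, operator.mul, protected_div]


def random_tree(n_inputs, depth, rng):
    """Grow a random expression tree; a leaf is the index of an input."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_inputs)
    return (rng.choice(BASIS),
            random_tree(n_inputs, depth - 1, rng),
            random_tree(n_inputs, depth - 1, rng))


def evaluate(tree, x):
    """Compute the constructed feature value for one input vector x."""
    if isinstance(tree, int):
        return x[tree]
    fn, left, right = tree
    return fn(evaluate(left, x), evaluate(right, x))


def subtrees(tree, path=()):
    """Yield the path (sequence of child positions) to every node."""
    yield path
    if not isinstance(tree, int):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))


def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)


def crossover(a, b, rng):
    """Subtree crossover: graft a random subtree of b into a."""
    pa = rng.choice(list(subtrees(a)))
    pb = rng.choice(list(subtrees(b)))
    return replace(a, pa, get(b, pb))


def fitness(tree, X, y):
    """Best accuracy of a single threshold on the feature (either polarity)."""
    values = [evaluate(tree, x) for x in X]
    best = 0.0
    for t in values:
        acc = sum((v > t) == bool(label) for v, label in zip(values, y)) / len(y)
        best = max(best, acc, 1.0 - acc)
    return best


def evolve(X, y, pop_size=40, generations=15, seed=0):
    """Keep the better half each generation and refill it by crossover."""
    rng = random.Random(seed)
    n_inputs = len(X[0])
    pop = [random_tree(n_inputs, 3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, X, y), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(rng.choice(parents), rng.choice(parents), rng)
            if sum(1 for _ in subtrees(child)) > MAX_NODES:
                child = random_tree(n_inputs, 3, rng)  # oversized: resample
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda t: fitness(t, X, y))


# Toy problem (hypothetical data): the class is the sign of x0 * x1,
# so neither raw input separates the classes with one threshold.
X = [(a, b) for a in (-2, -1, 1, 2) for b in (-2, -1, 1, 2)]
y = [int(a * b > 0) for a, b in X]
best = evolve(X, y)
print("fitness of best constructed feature:", fitness(best, X, y))
```

In this toy problem, the hand-built feature `(operator.mul, 0, 1)` separates the two classes with one threshold while each raw input alone does not, which is the kind of gain a constructed feature is meant to provide. A fuller treatment would score candidate features with the downstream classifier (for example a decision tree or a neural network) on held-out data rather than with a single threshold.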

Ronald Saul | Philip D. Laird

[1] J. Ross Quinlan, et al. Learning Efficient Classification Procedures and Their Application to Chess End Games, 1983.

[2] Laveen N. Kanal, et al. Problem-Solving Models and Search Strategies for Pattern Recognition, 1979, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3] David Haussler, et al. Learnability and the Vapnik-Chervonenkis dimension, 1989, JACM.

[4] Hitoshi Iba, et al. Genetic programming using a minimum description length principle, 1994.

[5] David Haussler, et al. Boolean Feature Discovery in Empirical Learning, 1990.

[6] Walter Alden Tackett, et al. Genetic Programming for Feature Discovery and Image Discrimination, 1993, ICGA.