Economical active feature-value acquisition through Expected Utility estimation

In many classification tasks, training data have missing feature values that can be acquired at a cost. For building accurate predictive models, acquiring all missing values is often prohibitively expensive or unnecessary, while acquiring a random subset of feature values may not be the most effective approach. The goal of active feature-value acquisition is to incrementally select the feature values that are most cost-effective for improving the model's accuracy. We present two policies, Sampled Expected Utility and Expected Utility-ES, that acquire feature values for inducing a classification model based on an estimate of the expected improvement in model accuracy per unit cost. A comparison of the two policies to each other and to alternative policies demonstrates that Sampled Expected Utility is preferable: it effectively reduces the cost of producing a model of a desired accuracy and exhibits consistent performance across domains.
