A logistic regression method for cost-sensitive active learning

Direct marketing involves offering a product or service to a carefully selected group of customers, namely those expected to yield the highest profit. Active learning is a data mining policy that actively selects unlabeled instances for labeling. In this research, our goal is to construct a model that minimizes the net acquisition cost of selecting instances for labeling while maximizing the net profit gained from approaching the selected customers. We present a new framework that combines a cost-sensitive active learning algorithm with a logistic regression classifier. We evaluated the framework on two benchmark datasets, and the results appear encouraging.
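
The abstract does not spell out the selection rule, so the following is only a minimal sketch of how a cost-sensitive active learning loop around a logistic regression classifier might look. Everything in it is an assumption made for illustration, not the authors' method: the function names, the fixed `benefit` and `contact_cost` values, the plain gradient-descent fit, and the heuristic of labeling the customers whose predicted response probability lies closest to the break-even point `contact_cost / benefit`. Labeling a customer is modeled as approaching them, so each label incurs the contact cost and earns the benefit only on a positive response.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w[0] is the bias; the remaining weights align with the features in x.
    return sigmoid(w[0] + sum(wj * v for wj, v in zip(w[1:], x)))

def fit_logistic(X, y, epochs=200, lr=0.1):
    # Batch gradient descent on the logistic loss (illustrative, not tuned).
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def expected_profit(p, benefit, contact_cost):
    # Expected net profit of approaching a customer who responds with probability p.
    return p * benefit - contact_cost

def select_for_labeling(w, pool, benefit, contact_cost, batch_size):
    # Assumed heuristic: prefer instances nearest the break-even probability,
    # where one more label is most likely to change the approach decision.
    threshold = contact_cost / benefit
    ranked = sorted(range(len(pool)),
                    key=lambda i: abs(predict_proba(w, pool[i]) - threshold))
    return ranked[:batch_size]

def run_active_learning(pool, oracle, benefit, contact_cost,
                        rounds=3, batch_size=5, seed=0):
    rng = random.Random(seed)
    unlabeled = list(range(len(pool)))
    rng.shuffle(unlabeled)
    # Seed set: a random batch, since no model exists yet.
    chosen, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
    labels = {i: oracle(i) for i in chosen}
    for _ in range(rounds):
        idx = list(labels)
        w = fit_logistic([pool[i] for i in idx], [labels[i] for i in idx])
        candidates = [pool[i] for i in unlabeled]
        picks = select_for_labeling(w, candidates, benefit, contact_cost, batch_size)
        for k in picks:
            labels[unlabeled[k]] = oracle(unlabeled[k])
        unlabeled = [i for i in unlabeled if i not in labels]
    idx = list(labels)
    w = fit_logistic([pool[i] for i in idx], [labels[i] for i in idx])
    # Net profit realized while acquiring labels: benefit per response minus contact costs.
    net = sum(labels[i] * benefit - contact_cost for i in idx)
    return w, labels, net
```

In this sketch `oracle(i)` stands in for observing whether customer `i` responds. Once the loop ends, a campaign could approach only the remaining customers for whom `expected_profit(predict_proba(w, x), benefit, contact_cost)` is positive.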
