Extracting Actionable Knowledge from Decision Trees

Most data mining algorithms and tools stop at discovered customer models, producing distribution information on customer profiles. Such techniques, when applied to industrial problems such as customer relationship management (CRM), are useful in pointing out customers who are likely attritors and customers who are loyal, but they require human experts to postprocess the discovered knowledge manually. Most postprocessing techniques have been limited to producing visualization results and interestingness rankings; they do not directly suggest actions that would increase an objective function such as profit. In this paper, we present novel algorithms that suggest actions to change customers from an undesired status (such as attritors) to a desired one (such as loyal) while maximizing an objective function: the expected net profit. These algorithms can discover cost-effective actions to transform customers from undesirable classes to desirable ones. Our approach integrates data mining and decision making tightly by formulating the decision-making problems directly on top of the data mining results in a postprocessing step. To improve the effectiveness of the approach, we also present an ensemble of decision trees, which is shown to be more robust when the training data changes. Empirical tests are conducted on both a realistic insurance application domain and UCI benchmark data.
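The idea of choosing an action that maximizes expected net profit can be illustrated with a minimal sketch. The abstract does not give the objective in closed form, so the sketch assumes one plausible reading: the net profit of moving a customer from one decision-tree leaf to another is the profit of a loyal customer times the gain in loyalty probability, minus the cost of changing each attribute that differs. All names (`Leaf`, `best_action`, `change_cost`), the leaf probabilities, and the cost figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    """A decision-tree leaf: the attribute tests on its path and P(loyal)."""
    name: str
    conditions: dict
    p_loyal: float


def find_leaf(customer, leaves):
    """Return the leaf whose path conditions the customer satisfies."""
    for leaf in leaves:
        if all(customer[a] == v for a, v in leaf.conditions.items()):
            return leaf
    raise ValueError("customer matches no leaf")


def best_action(customer, leaves, change_cost, profit_if_loyal):
    """Pick the target leaf with the highest assumed net profit.

    Assumed objective (not taken from the abstract):
        net = profit_if_loyal * (p_target - p_source) - sum of change costs
    Returns (net, target_leaf), or (0.0, None) if no move is profitable.
    """
    source = find_leaf(customer, leaves)
    best_net, best_leaf = 0.0, None
    for target in leaves:
        if target is source:
            continue
        # Cost of every attribute value that must change to reach the target.
        cost = sum(
            change_cost(attr, customer[attr], value)
            for attr, value in target.conditions.items()
            if customer[attr] != value
        )
        net = profit_if_loyal * (target.p_loyal - source.p_loyal) - cost
        if net > best_net:
            best_net, best_leaf = net, target
    return best_net, best_leaf


# Hypothetical tree over two attributes, with made-up probabilities.
leaves = [
    Leaf("A", {"service": "low", "rate": "high"}, 0.20),
    Leaf("B", {"service": "high", "rate": "high"}, 0.60),
    Leaf("C", {"service": "low", "rate": "low"}, 0.80),
    Leaf("D", {"service": "high", "rate": "low"}, 0.85),
]

# Hypothetical cost table; unlisted changes are treated as impossible.
COSTS = {("service", "low", "high"): 100.0, ("rate", "high", "low"): 250.0}


def change_cost(attr, old, new):
    return COSTS.get((attr, old, new), float("inf"))


customer = {"service": "low", "rate": "high"}
net, target = best_action(customer, leaves, change_cost, profit_if_loyal=1000.0)
```

In this toy setting the customer starts in leaf A, and the search compares moving to B, C, or D. Treating unlisted attribute changes as infinitely costly is one simple way to model attributes that cannot be changed, such as age.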
