Postprocessing decision trees to extract actionable knowledge

Most data mining algorithms and tools stop at discovered customer models, producing distribution information on customer profiles. Such techniques, when applied to industrial problems such as customer relationship management (CRM), are useful for identifying customers who are likely attritors and customers who are loyal, but they require human experts to postprocess the mined information manually. Most postprocessing techniques have been limited to producing visualization results and interestingness rankings; they do not directly suggest actions that would lead to an increase in an objective function such as profit. Here, we present a novel algorithm that suggests actions to change customers from an undesired status (such as attritors) to a desired one (such as loyal) while maximizing an objective function: the expected net profit. We develop these algorithms under the resource constraints that abound in practice. The contribution of the work is in taking the output of an existing, mature technique (decision trees, for example) and producing novel, actionable knowledge through automatic postprocessing.
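
To make the idea of profit-maximizing actions concrete, the following is a minimal sketch and not the algorithm of the paper, whose details the abstract does not give. It assumes that each decision-tree leaf carries an estimated probability of a customer being loyal, that moving a customer from one leaf to another has a known cost, and that the expected net profit of a move is the profit from a loyal customer times the probability gain, minus that cost. The names `Leaf`, `best_action`, `switch_cost`, and `profit_if_loyal` are hypothetical, and resource constraints are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    """A decision-tree leaf (hypothetical structure for this sketch)."""
    name: str
    p_loyal: float  # estimated probability that a customer in this leaf is loyal


def best_action(current, leaves, switch_cost, profit_if_loyal):
    """Pick the target leaf with the highest expected net profit.

    Assumed formula (not taken from the abstract):
        net = profit_if_loyal * (p_target - p_current) - cost(current -> target)

    Returns (target_leaf, net_profit), or (None, 0.0) if no move is profitable.
    """
    best_leaf, best_net = None, 0.0
    for target in leaves:
        if target == current:
            continue
        # A missing entry means the move is infeasible (e.g. an attribute
        # that cannot be changed by any marketing action).
        cost = switch_cost.get((current.name, target.name))
        if cost is None:
            continue
        net = profit_if_loyal * (target.p_loyal - current.p_loyal) - cost
        if net > best_net:
            best_leaf, best_net = target, net
    return best_leaf, best_net


# Illustrative numbers only; they are not results from the paper.
a = Leaf("A", 0.25)
b = Leaf("B", 0.75)
c = Leaf("C", 0.875)
costs = {("A", "B"): 100.0, ("A", "C"): 500.0}
target, net = best_action(a, [a, b, c], costs, profit_if_loyal=1000.0)
print(target.name, net)
```

In this toy setting the leaf with the highest loyalty probability is not the best target, because the cost of reaching it outweighs the extra probability gain. Under a budget or other resource constraint, a per-customer choice like this would no longer be enough, since the actions would have to be selected jointly across customers.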
