Extracting optimal actionable plans from additive tree models

Although machine learning has made substantial progress in generalization accuracy and efficiency, there is still very limited work on deriving meaningful decision-making actions from the resulting models. In many applications, such as advertising, recommendation systems, social networks, customer relationship management, and clinical prediction, users need not only accurate predictions but also suggestions on actions that achieve a desirable goal (e.g., high ad hit rates) or avert an undesirable predicted result (e.g., clinical deterioration). Existing work on extracting such actionability is scarce and limited to simple models such as a single decision tree. The dilemma is that models with high accuracy are often more complex, and actionability is harder to extract from them.

In this paper, we propose an effective method to extract actionable knowledge from additive tree models (ATMs), one of the most widely used and best off-the-shelf classifiers. We rigorously formulate the optimal actionable planning (OAP) problem for a given ATM: extract an actionable plan for a given input so that it achieves a desirable output while maximizing the net profit. Based on a state space graph formulation, we first propose an optimal heuristic search method designed to find an optimal solution. We then present a sub-optimal heuristic search with an admissible and consistent heuristic function, which can remarkably improve the efficiency of the algorithm. Our experimental results demonstrate the effectiveness and efficiency of the proposed algorithms on several real datasets in the application domain of personal credit and banking.
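To make the state space view of the OAP problem concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a toy ATM given as weighted trees, a hypothetical list of actions that each set one feature to a new value at a fixed cost, and a fixed reward for reaching the target output, so that maximizing net profit reduces to minimizing total action cost. It uses uniform-cost search, which is best-first search with a zero heuristic (trivially admissible and consistent). All names (`TREES`, `ACTIONS`, `cheapest_plan`) and all numbers are illustrative assumptions.

```python
import heapq
from itertools import count

# Hypothetical toy ATM: a list of (weight, tree) pairs.
# A tree is a float leaf or a tuple (feature, threshold, left, right);
# the left branch is taken when x[feature] <= threshold.
TREES = [
    (0.5, ("income", 50, 0.2, 0.8)),
    (0.5, ("debt", 10, ("income", 30, 0.4, 0.9), 0.1)),
]

# Hypothetical actions: (feature, new_value, cost).
ACTIONS = [
    ("income", 60, 3.0),
    ("income", 40, 1.0),
    ("debt", 5, 2.0),
    ("debt", 20, 0.5),
]


def tree_predict(tree, x):
    """Walk one tree from the root down to a leaf value."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree


def atm_predict(trees, x):
    """Additive model output: weighted sum of the tree outputs."""
    return sum(weight * tree_predict(tree, x) for weight, tree in trees)


def cheapest_plan(trees, x0, actions, target):
    """Uniform-cost search over feature states.

    Returns (total_cost, plan) for the cheapest sequence of actions whose
    resulting input has model output >= target, or None if no reachable
    state meets the target.
    """
    features = sorted(x0)
    start = tuple(x0[f] for f in features)
    tie = count()  # tie-breaker so the heap never compares plans
    frontier = [(0.0, next(tie), start, [])]
    best = {start: 0.0}  # cheapest known cost per state
    while frontier:
        cost, _, state, plan = heapq.heappop(frontier)
        if cost > best.get(state, float("inf")):
            continue  # stale queue entry
        x = dict(zip(features, state))
        if atm_predict(trees, x) >= target:
            return cost, plan
        for feature, value, step_cost in actions:
            if x[feature] == value:
                continue  # action would not change the state
            nxt = dict(x)
            nxt[feature] = value
            key = tuple(nxt[f] for f in features)
            new_cost = cost + step_cost
            if new_cost < best.get(key, float("inf")):
                best[key] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost, next(tie), key, plan + [(feature, value)]),
                )
    return None


if __name__ == "__main__":
    print(cheapest_plan(TREES, {"income": 20, "debt": 15}, ACTIONS, 0.8))
```

Because all action costs are positive, the first goal state popped from the queue is a cheapest one under these assumptions. A non-trivial heuristic would only change the order in which states are expanded, which is where the efficiency gains described above would come from.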
