Knowledge-Rich Data Mining in Financial Risk Detection

Financial risks are risks associated with financing, such as credit risk, business risk, debt risk, and insurance risk; they can put firms in distress. Early detection of financial risks can help credit grantors reduce risk and losses, establish appropriate policies for different credit products, and increase revenue. As financial databases grow, large-scale data mining techniques that can process and analyze massive amounts of electronic data in a timely manner have become a key component of many financial risk detection strategies and remain a subject of active research. However, in financial risk detection a large gap remains between the results that data mining methods provide and the actions that can be taken based on them. The goal of this research is to bring the concept of chance discovery into financial risk detection in order to build a knowledge-rich data mining process and thereby increase the usefulness of data mining results. Using six financial-risk-related datasets, this research illustrates that combining data mining techniques with chance discovery can provide knowledge-rich results to decision makers, promote awareness of previously unnoticed chances, and increase the actionability of data mining results.
