Multiple instance learning for credit risk assessment with transaction data

Abstract As the number of personal loan applications grows rapidly, credit risk assessment has become increasingly important to both practitioners and researchers. Traditional assessment systems take an applicant's socio-demographic information and loan application information as inputs for feature engineering; they omit the applicant's dynamic transaction history, even though it is an important indicator of repayment behavior. The present study proposes a comprehensive assessment method that incorporates both conventional data (socio-demographic and loan application information) and data on the applicant's dynamic transaction behavior. The method is based on Radial Basis Function (RBF) Multiple Instance Learning (MIL), which extracts features from a person's transaction history. Five real-world datasets from two large commercial banks in China are used to validate the effectiveness of the proposed method. The experimental results show that the method substantially improves prediction performance as measured by the most commonly used model evaluation criteria.
