A clustering and selection based transfer ensemble model for customer credit scoring

Customer credit scoring is an important concern for numerous domestic and global industries. It is difficult for traditional models to achieve satisfactory performance, because they are constructed on the assumption that the training and test data follow the same distribution, whereas in reality customers usually come from different districts and may follow different distributions. This study combines ensemble learning with transfer learning and proposes a clustering and selection based transfer ensemble (CSTE) model, which transfers instances from related source domains to the target domain to assist in modeling. Experimental results on two customer credit scoring datasets show that the CSTE model outperforms two traditional credit scoring models as well as three existing transfer learning models.
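The abstract does not spell out the individual steps of the model, so the following is only a minimal sketch of the general idea of clustering-based instance selection followed by an ensemble, not the authors' algorithm. Everything in it is an assumption made for illustration: the toy data, the function names (`select_source`, `fit_ensemble`, and so on), k-means with farthest-first initialization as the clustering step, the rule "keep source instances that share a cluster with a target instance", and stratified bagging of nearest-centroid classifiers as the ensemble.

```python
import random
from collections import Counter

def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: dist2(p, centers[j]))

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers

def select_source(source, target, k=3):
    """Assumed selection rule: keep source rows whose cluster also holds a target row."""
    centers = kmeans([x for x, _ in source + target], k)
    target_clusters = {nearest(x, centers) for x, _ in target}
    return [(x, y) for x, y in source if nearest(x, centers) in target_clusters]

def fit_centroid(data):
    """Nearest-centroid base classifier: one mean vector per class label."""
    by_class = {}
    for x, y in data:
        by_class.setdefault(y, []).append(x)
    return {y: tuple(sum(c) / len(xs) for c in zip(*xs)) for y, xs in by_class.items()}

def fit_ensemble(data, n_members=5, seed=0):
    """Bagging with a stratified bootstrap, so every member sees both classes."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in data:
        by_class.setdefault(y, []).append((x, y))
    members = []
    for _ in range(n_members):
        sample = [rng.choice(rows) for rows in by_class.values() for _ in rows]
        members.append(fit_centroid(sample))
    return members

def predict_ensemble(members, x):
    """Majority vote over the base classifiers."""
    votes = Counter(min(m, key=lambda y: dist2(x, m[y])) for m in members)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): label 1 = bad credit, 0 = good credit.
target = [((0.0, 0.0), 0), ((0.2, 0.1), 0), ((1.0, 1.0), 1), ((1.1, 0.9), 1)]
source = [((0.1, 0.1), 0), ((0.0, 0.2), 0), ((0.9, 1.1), 1), ((1.2, 1.0), 1),
          ((10.0, 10.0), 0), ((10.2, 9.9), 0), ((9.8, 10.1), 0)]  # last three: unrelated region

selected = select_source(source, target, k=3)
ensemble = fit_ensemble(target + selected)
print(len(selected), predict_ensemble(ensemble, (1.0, 1.1)))
```

In this toy setting the three source instances that lie far from every target instance end up in a cluster of their own and are left out, so only the source instances that resemble the target domain contribute to training.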
