A Trial of Student Self-Sponsored Peer-to-Peer Lending Based on Credit Evaluation Using Big Data Analysis

There is still no effective approach to credit evaluation for Chinese university students. Because students lack financial records, banks and other financial institutions cannot routinely assess their credit status and typically reject their loan applications, so students who need to borrow often have no option other than online peer-to-peer (P2P) loan platforms. This paper therefore draws on university students' diversified daily behavior data and uses logistic regression (LR) and gradient boosting decision tree (GBDT) algorithms to develop credit evaluation models for university students; the models were validated on a real-time P2P lending platform. Students' overdue behavior in returning books to the university library was used as an index. Trained on 17,838 samples, both models performed well, and the GBDT-based model was better at identifying "bad borrowers." Based on the proposed models, a self-sponsored peer-to-peer loan platform was established and run at a Chinese university for ten months, and the results showed that adopting such credit evaluation models can effectively reduce the default ratio.
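
To make the modeling idea concrete, the following is a minimal sketch of the logistic regression half of the approach, written in plain Python. It is not the authors' implementation: the two features (`late_rate`, `attendance`), the synthetic data generator, the coefficients used to generate labels, and the 0.5 decision threshold are all hypothetical choices made for illustration. The GBDT model is not reproduced here.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    """Estimated probability that a student is a 'bad borrower'."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic_regression(X, y, lr=0.5, epochs=500):
    """Fit weights by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, label in zip(X, y):
            err = predict_proba(weights, bias, x) - label
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def make_synthetic_students(n, seed=0):
    """Hypothetical data: NOT the paper's data set or feature set."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        late_rate = rng.random()   # assumed: share of library loans returned late
        attendance = rng.random()  # assumed: share of classes attended
        # Assumed ground truth: late returns raise risk, attendance lowers it.
        p_bad = sigmoid(4.0 * late_rate - 3.0 * attendance - 0.5)
        X.append([late_rate, attendance])
        y.append(1 if rng.random() < p_bad else 0)
    return X, y


def recall_on_bad(weights, bias, X, y, threshold=0.5):
    """Fraction of true bad borrowers flagged at the given threshold."""
    bad = [x for x, label in zip(X, y) if label == 1]
    if not bad:
        return 0.0
    caught = sum(predict_proba(weights, bias, x) >= threshold for x in bad)
    return caught / len(bad)


X, y = make_synthetic_students(400)
weights, bias = train_logistic_regression(X, y)
risky = predict_proba(weights, bias, [0.9, 0.1])  # often late, rarely attends
safe = predict_proba(weights, bias, [0.1, 0.9])   # rarely late, often attends
bad_recall = recall_on_bad(weights, bias, X, y)
print(f"risky={risky:.2f} safe={safe:.2f} recall_on_bad={bad_recall:.2f}")
```

Recall on the "bad borrower" class is reported here because the abstract compares the models on how well they identify bad borrowers; a real evaluation would measure it on held-out data rather than on the training set, as this sketch does for brevity.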
