Bank loan analysis using customer usage data: A big data approach using Hadoop

Rapid economic development has sharply increased customer demand for personal loans, while borrower behavior remains uncertain and fuzzy in nature. Credit risk is a major challenge for both lenders and borrowers, and it directly or indirectly affects the reliability of banks. This article concentrates on the risk incurred in granting loans to customers and the related risk to investors. The objective of the paper is to analyze the credit risk and loan performance of Lending Club, one of the biggest marketplaces for online credit. Loan performance and credit risk are analyzed on a large dataset of 112 attributes collected from Lending Club for the period between 2012 and 2016. The paper takes a Hadoop approach, applied through Cloudera, an open source platform for data analysis that supports the Hadoop ecosystem for managing, storing and analyzing large volumes of data. We use Hive, a data warehouse system for managing and analyzing data stored in HDFS (Hadoop Distributed File System) through HiveQL. To understand the performance of the bank loan data, we performed various analyses on the collected dataset.
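The kind of aggregate analysis described above can be sketched as follows. This is only an illustration, not the paper's actual queries: it uses Python's built-in sqlite3 module as a stand-in for Hive, the column names `grade`, `loan_amnt` and `loan_status` are assumed for the example, and the sample rows are invented. The query uses only a `GROUP BY` with `CASE` aggregates, a form that HiveQL also accepts.

```python
import sqlite3

# Hypothetical sample rows (grade, loan_amnt, loan_status).
# These values are invented for illustration; they are not Lending Club data.
SAMPLE_LOANS = [
    ("A", 10000.0, "Fully Paid"),
    ("A", 5000.0, "Fully Paid"),
    ("B", 12000.0, "Charged Off"),
    ("B", 8000.0, "Fully Paid"),
    ("C", 15000.0, "Charged Off"),
]

# Per-grade loan count, average amount, and share of charged-off loans.
# Table and column names are assumptions made for this sketch.
QUERY = """
SELECT grade,
       COUNT(*) AS n_loans,
       AVG(loan_amnt) AS avg_amount,
       SUM(CASE WHEN loan_status = 'Charged Off' THEN 1 ELSE 0 END) * 1.0
           / COUNT(*) AS charged_off_rate
FROM loans
GROUP BY grade
ORDER BY grade
"""


def summarize_by_grade(rows):
    """Load rows into an in-memory table and return the per-grade summary."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE loans (grade TEXT, loan_amnt REAL, loan_status TEXT)"
        )
        conn.executemany("INSERT INTO loans VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in summarize_by_grade(SAMPLE_LOANS):
        print(row)
```

In a Hive deployment the same statement would run against a table defined over files in HDFS rather than an in-memory database; the table definition and data loading steps are omitted here because the abstract does not describe them.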
