Ensemble Incremental Learning Iterative Mechanism on Reject Inference

This paper investigates whether statistical methods (extrapolation and naive Bayes, NB) and supervised learning algorithms (LightGBM, XGBoost, random forest, logistic regression, CatBoost) can be combined through ensemble learning with weighted voting into a benchmark model that predicts the default status of rejected samples and thereby yields a full-sample data set. One challenge of this research is designing a new training-sample selection process, which requires an effective mechanism for setting the inclusion ratio of rejected samples and multiple rounds of iterative verification based on AUC. An empirical analysis of the Xunmiao and LendingClub data sets shows that reject inference can improve the prediction accuracy of the personal credit scoring model, and that the gain comes mainly from more accurate identification of defaulting customers.
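The selection process described above can be illustrated with a minimal sketch. It is not the paper's implementation: all names (`auc`, `weighted_vote`, `iterative_reject_inference`, `fit`, `ratio`, `max_rounds`) are hypothetical, `fit` stands in for training the base learners and choosing their voting weights, and both the confidence-based selection rule (scores farthest from 0.5) and the stopping rule (stop when validation AUC no longer improves) are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[float]
Scorer = Callable[[Row], float]      # estimated default probability in [0, 1]
Labeled = List[Tuple[Row, int]]      # (features, 1 = default, 0 = non-default)
# Hypothetical training hook: returns (base scorers, voting weights).
Fit = Callable[[Labeled], Tuple[List[Scorer], List[float]]]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (default, non-default) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_vote(scorers: List[Scorer], weights: List[float], row: Row) -> float:
    """Weighted (soft) voting: weight-normalised average of the base scores."""
    return sum(w * f(row) for f, w in zip(scorers, weights)) / sum(weights)


def ensemble_auc(scorers: List[Scorer], weights: List[float], data: Labeled) -> float:
    labels = [y for _, y in data]
    scores = [weighted_vote(scorers, weights, x) for x, _ in data]
    return auc(labels, scores)


def iterative_reject_inference(
    fit: Fit,
    accepted: Labeled,
    rejects: List[Row],
    valid: Labeled,
    ratio: float = 0.2,      # assumed inclusion ratio per round
    max_rounds: int = 5,
) -> Tuple[Labeled, float]:
    """Add pseudo-labelled rejects round by round while validation AUC improves."""
    train, pool = list(accepted), list(rejects)
    scorers, weights = fit(train)
    best = ensemble_auc(scorers, weights, valid)
    for _ in range(max_rounds):
        k = int(len(pool) * ratio)
        if k == 0:
            break
        # Rank the remaining rejects by confidence (distance of the score from 0.5).
        scored = sorted(
            ((weighted_vote(scorers, weights, r), r) for r in pool),
            key=lambda t: abs(t[0] - 0.5),
            reverse=True,
        )
        picked = [(r, int(p >= 0.5)) for p, r in scored[:k]]
        cand_scorers, cand_weights = fit(train + picked)
        cand_auc = ensemble_auc(cand_scorers, cand_weights, valid)
        if cand_auc <= best:
            break  # this batch of rejects did not raise validation AUC
        train = train + picked
        pool = [r for _, r in scored[k:]]
        scorers, weights, best = cand_scorers, cand_weights, cand_auc
    return train, best
```

In practice `fit` would wrap the actual base models and derive the voting weights, for example from each model's validation AUC; that choice is likewise an assumption here rather than something stated in the abstract.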