Application of Genetic Programming in credit scoring

Derived characteristics are usually regarded as important indicators in credit scoring; however, analytical methods can yield only those derived characteristics that are suggested by common sense. In this paper, the selection of derived characteristics is treated as a combinatorial optimization problem over mathematical symbols and original characteristics. To solve this problem, a Genetic Programming algorithm is proposed in which candidate solutions are encoded as trees and the objective function is the Information Value (IV). A human-computer interaction procedure is designed to choose derived characteristics with practical significance from the better results obtained by the Genetic Programming algorithm. Furthermore, an improved model based on linear discriminant analysis is proposed in which the derived characteristics are included. Simulation experiments show that the results are satisfactory and that the proposed models have competitive discriminatory power.
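
To make the tree encoding and the IV objective concrete, the following is a minimal sketch rather than the implementation used in the paper. It assumes a derived characteristic is a nested tuple of binary operators over original characteristics, and it scores a candidate with the common definition IV = sum over bins of (g_i - b_i) ln(g_i / b_i), where g_i and b_i are the shares of good and bad applicants in bin i. The operator set, the equal-frequency binning, the smoothing constant `eps`, and the field names `debt` and `income` are all illustrative assumptions not taken from the source.

```python
import math

# Hypothetical operator set; the source does not list the symbols it uses.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": lambda a, b: a / b if b != 0 else 0.0,  # protected division
}


def evaluate(tree, row):
    """Evaluate an expression tree on one applicant record (a dict)."""
    if isinstance(tree, str):  # leaf: name of an original characteristic
        return float(row[tree])
    op, left, right = tree     # internal node: (operator, left, right)
    return OPS[op](evaluate(left, row), evaluate(right, row))


def information_value(values, labels, n_bins=2, eps=0.5):
    """IV of one characteristic; labels are 1 for good and 0 for bad.

    Bins are equal-frequency slices of the sorted values (an assumption),
    and eps smooths the counts so that empty cells do not produce log(0).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    total_good = sum(labels)
    total_bad = n - total_good
    iv = 0.0
    for k in range(n_bins):
        idx = order[k * n // n_bins:(k + 1) * n // n_bins]
        good = sum(labels[i] for i in idx)
        bad = len(idx) - good
        g = (good + eps) / (total_good + eps * n_bins)
        b = (bad + eps) / (total_bad + eps * n_bins)
        iv += (g - b) * math.log(g / b)
    return iv


def fitness(tree, rows, labels, n_bins=2):
    """Fitness of a candidate tree: the IV of the characteristic it derives."""
    return information_value([evaluate(tree, r) for r in rows], labels, n_bins)


# Toy data (invented for illustration): the ratio separates the classes,
# while income alone does not.
rows = [
    {"debt": 1, "income": 10},
    {"debt": 2, "income": 20},
    {"debt": 5, "income": 10},
    {"debt": 10, "income": 20},
]
labels = [1, 1, 0, 0]
ratio_tree = ("div", "debt", "income")
print(fitness(ratio_tree, rows, labels) > fitness("income", rows, labels))
```

In a full Genetic Programming run, a function such as `fitness` would rank the trees produced by crossover and mutation, and the highest-scoring trees would then be handed to the human reviewer for the practical-significance check described above.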