A Multi-criteria Convex Quadratic Programming model for credit data analysis

Speed and scalability are two essential issues in data mining and knowledge discovery. This paper proposes a mathematical programming model that addresses both issues and applies it to credit classification problems. The proposed Multi-criteria Convex Quadratic Programming (MCQP) model is highly efficient (computing time complexity between O(n^1.5) and O(n^2)) and scalable to massive problems (size of O(10^9)) because it only needs to solve linear equations to find the global optimal solution. Kernel functions are introduced into the model to solve nonlinear problems. In addition, the theoretical relationship between the proposed MCQP model and support vector machines (SVM) is discussed.
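To illustrate the claim that a convex quadratic classification model can be trained by solving linear equations alone, the following is a minimal sketch. It is not the MCQP formulation itself, which the abstract does not spell out. As a stand-in it uses a regularized least-squares classifier (in the spirit of proximal SVM), whose unique global optimum is given by the normal equations. The function names, the toy data, and the regularization value `reg` are all assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b] so row operations apply to both sides.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares_classifier(points, labels, reg=1e-2):
    """Minimise reg*||u||^2 + sum_i (u . z_i - y_i)^2, with z_i = [x_i, 1].

    The objective is strictly convex, so its global minimiser is the
    solution of the normal equations (Z^T Z + reg*I) u = Z^T y.
    """
    z = [list(p) + [1.0] for p in points]  # append a bias feature
    d = len(z[0])
    gram = [[sum(row[i] * row[j] for row in z) + (reg if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(row[i] * y for row, y in zip(z, labels)) for i in range(d)]
    return solve_linear_system(gram, rhs)


def predict(u, point):
    """Classify by the sign of the fitted linear score."""
    score = sum(w * v for w, v in zip(u, list(point) + [1.0]))
    return 1 if score >= 0 else -1


# Made-up two-feature records: -1 and +1 are hypothetical class labels.
points = [(1.0, 1.0), (1.5, 0.5), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = [-1, -1, -1, 1, 1, 1]
u = fit_least_squares_classifier(points, labels)
predictions = [predict(u, p) for p in points]
```

In this sketch the linear system has one unknown per feature plus a bias term, so its size does not grow with the number of records. A kernelized variant would replace the inner products with kernel evaluations, in which case the system would instead be sized by the number of records.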
