Multiple criteria linear programming approach to data mining: Models, algorithm designs and software development

It is well known that data mining has been implemented with statistical regression, decision tree induction, neural networks, rough sets, fuzzy sets, and other methods. This paper promotes a multiple criteria linear programming (MCLP) approach to data mining based on linear discriminant analysis. First, it describes the fundamental connections between MCLP and data mining, including several general MCLP models. Given these general models, it focuses on a design architecture for MCLP data mining algorithms that follows a real-life business intelligence process. This architecture consists of finding MCLP solutions, preparing mining scores, and interpreting the knowledge patterns. Second, the paper elaborates on the software development of the MCLP data mining algorithms. Based on pseudocode, two versions of the software (for the SAS and Linux platforms) are discussed. Finally, a performance analysis of the software over business and experimental databases is reported to show its mining and prediction power. As part of this analysis, a series of data testing comparisons between the MCLP and decision tree induction approaches is presented. These findings suggest that MCLP data mining techniques have great potential for discovering knowledge patterns in large-scale real-life databases or data warehouses.
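For orientation, a two-class MCLP discriminant model is often written in the following form in the MCLP literature. This is a sketch under stated assumptions, not necessarily the exact model used in this paper: the symbols $A_i$ (attribute vector of record $i$), $X$ (attribute weights), $b$ (boundary), $\alpha_i$ (overlap of a misclassified record with the boundary), $\beta_i$ (distance of a correctly classified record from the boundary), and the group labels $G_1$, $G_2$ are all introduced here for illustration.

```latex
% Assumed two-class MCLP model: two criteria, linear constraints.
\begin{align*}
\text{minimize}   \quad & \sum_i \alpha_i
  && \text{(total overlap)} \\
\text{maximize}   \quad & \sum_i \beta_i
  && \text{(total distance from the boundary)} \\
\text{subject to} \quad & A_i X = b + \alpha_i - \beta_i, && A_i \in G_1, \\
                        & A_i X = b - \alpha_i + \beta_i, && A_i \in G_2, \\
                        & \alpha_i \ge 0, \quad \beta_i \ge 0 .
\end{align*}
```

Under this form, the mining score of a record is the value $A_i X$: a score at or below $b$ assigns the record to $G_1$, and a score at or above $b$ assigns it to $G_2$.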
