Applying a hybrid data-mining approach to prediction problems: a case of preferred suppliers prediction

Data mining has emerged as a powerful tool for automating decision-making processes. The traditional data-mining approach to prediction has two deficiencies. First, low-accuracy decision rules cannot be used, and even high-accuracy rules are not guaranteed to be used, because the premise (condition) part of a rule must match the input domain of the testing data set. Second, negative information/data, defined as information/data that misclassify outcomes in the testing data set, are normally discarded, although they can be reused for training and can have a positive impact on prediction accuracy. This paper presents a data-mining-based hybrid approach that consists of a novel rough-set algorithm for feature selection and an enhanced multi-class support vector machine (SVM) method for accurate prediction. The approach can simultaneously derive decision rules, identify the most significant features, and generate a well-tuned, high-accuracy prediction model. A case study of supplier selection for a video game system is validated with historical data, and the results show the practical viability of the hybrid approach for predicting preferred suppliers.
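
To make the feature-selection step concrete, the following is a minimal sketch of rough-set feature selection using the standard dependency degree and a greedy, QuickReduct-style search. It is not the paper's novel algorithm, which the abstract does not specify. The supplier records, the feature names (`price`, `quality`, `delivery`, `region`), and the function names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy supplier records (assumption, not data from the paper).
ROWS = [
    {"price": "low",  "quality": "high", "delivery": "fast", "region": "asia", "preferred": "yes"},
    {"price": "low",  "quality": "high", "delivery": "slow", "region": "eu",   "preferred": "yes"},
    {"price": "high", "quality": "high", "delivery": "fast", "region": "asia", "preferred": "no"},
    {"price": "high", "quality": "low",  "delivery": "slow", "region": "eu",   "preferred": "no"},
    {"price": "low",  "quality": "low",  "delivery": "fast", "region": "eu",   "preferred": "no"},
    {"price": "high", "quality": "high", "delivery": "slow", "region": "asia", "preferred": "no"},
]
FEATURES = ["price", "quality", "delivery", "region"]


def dependency(rows, features, decision):
    """Rough-set dependency degree: the fraction of rows whose equivalence
    class under `features` maps to exactly one decision value."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[f] for f in features)
        classes[key].append(row[decision])
    consistent = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, candidates, decision):
    """Greedily add the feature that raises the dependency degree most,
    until the subset is as discriminating as the full feature set."""
    target = dependency(rows, candidates, decision)
    selected = []
    while dependency(rows, selected, decision) < target:
        remaining = [f for f in candidates if f not in selected]
        best = max(remaining,
                   key=lambda f: dependency(rows, selected + [f], decision))
        selected.append(best)
    return selected


print(greedy_reduct(ROWS, FEATURES, "preferred"))  # ['price', 'quality']
```

In a hybrid pipeline of the kind the abstract describes, the selected columns would then be passed to a multi-class SVM for training. That step is omitted here because the abstract gives no details of the enhanced SVM method.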
