TOPSIS Based Multi-Criteria Decision Making of Feature Selection Techniques for Network Traffic Dataset

Intrusion detection systems (IDS) must process millions of packets with many features, which delays the detection of anomalies. Sampling and feature selection can reduce computation time and hence intrusion detection time. This paper recommends a feature selection algorithm on the basis of the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), a method for choosing one or more options from a set of alternatives that are each described by many attributes. Ten feature selection techniques were applied to the KDD network dataset. Three classifiers, namely Naïve Bayes, J48 and PART, were evaluated using the Weka data mining tool. The TOPSIS ranking of the techniques was computed in MATLAB. Of these techniques, Filtered Subset Evaluation was found suitable for intrusion detection, as it requires very little computation time while retaining acceptable accuracy.

Keywords: Feature selection, Multi-criteria decision making, TOPSIS, Intrusion Detection System, Network Traffic Classification
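As a rough illustration of how TOPSIS ranks alternatives, the sketch below implements the standard textbook steps: vector normalization, weighting, distances to the ideal and anti-ideal solutions, and a closeness score. It is written in Python rather than the MATLAB used in the paper, and it is not the authors' code. The function name `topsis`, the equal weights, and the three example rows (accuracy in percent, run time in seconds) are assumptions made up for illustration only; they are not results from the paper.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) over criteria (columns) with TOPSIS.

    matrix  -- list of rows, one row of criterion values per alternative
    weights -- one weight per criterion (rescaled here to sum to 1)
    benefit -- True where larger is better, False where smaller is better
    Returns one closeness score per alternative in [0, 1]; higher is better.
    """
    n_crit = len(weights)
    total_w = sum(weights)
    w = [x / total_w for x in weights]

    # Step 1: vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [
        [w[j] * row[j] / norms[j] if norms[j] else 0.0 for j in range(n_crit)]
        for row in matrix
    ]

    # Step 2: build the ideal and anti-ideal points, criterion by criterion.
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]

    # Step 3: Euclidean distance to both points, then relative closeness.
    scores = []
    for row in v:
        d_pos = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ideal)))
        d_neg = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, anti)))
        denom = d_pos + d_neg
        scores.append(d_neg / denom if denom else 0.0)
    return scores


if __name__ == "__main__":
    # Hypothetical values, NOT taken from the paper:
    # each row is (accuracy in %, run time in seconds) for one technique.
    example = [
        [99.0, 120.0],
        [98.5, 10.0],
        [90.0, 8.0],
    ]
    scores = topsis(example, weights=[0.5, 0.5], benefit=[True, False])
    ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    print(scores, ranking)
```

Marking run time as a cost criterion (`benefit=False`) is what lets a fast technique with slightly lower accuracy outrank a slower, marginally more accurate one. The weights control how strongly that trade-off is felt, so a real analysis would need to justify them.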
