A Filter Based Feature Selection Approach in MSVM Using DCA and Its Application in Network Intrusion Detection

We develop a filter-based feature selection approach for multi-class classification that optimizes the so-called Generic Feature Selection (GeFS) measure and then trains Multi-class Support Vector Machine (MSVM) classifiers on the selected features. The problem is first formulated as a polynomial mixed 0-1 fractional programming problem and then equivalently transformed into a mixed 0-1 linear programming (M01LP) problem. DCA (Difference of Convex functions Algorithm), an innovative approach in the nonconvex programming framework, is investigated for solving the M01LP problem. The proposed algorithm is applied to Intrusion Detection Systems (IDSs), and experiments are conducted on the benchmark KDD Cup 1999 dataset, which contains millions of audited connection records and includes a wide variety of intrusions simulated in a military network environment. We compare our method with an embedded method for MSVM that uses the $\ell_2$-$\ell_0$ regularizer. Preliminary numerical results show that the proposed algorithm achieves classification performance comparable to that of the $\ell_2$-$\ell_0$ regularized MSVM while requiring less computation.
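To make the idea of a filter-type, fractional 0-1 selection objective concrete, the following minimal sketch scores feature subsets with the correlation-based feature selection (CFS) merit, which is generally described as one instance of the GeFS family. It is an illustration only: the exhaustive search stands in for the M01LP reformulation and DCA, which are not reproduced here, and the function names `cfs_merit` and `best_subset` as well as the toy correlation values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cfs_merit(subset, r_cf, r_ff):
    """CFS merit of a feature subset.

    r_cf[i]    : correlation between feature i and the class label.
    r_ff[i][j] : correlation between features i and j.
    Merit = k * mean(r_cf) / sqrt(k + k * (k - 1) * mean(r_ff)).
    """
    k = len(subset)
    if k == 0:
        return 0.0
    mean_cf = sum(r_cf[i] for i in subset) / k
    pairs = list(combinations(subset, 2))
    mean_ff = sum(r_ff[i][j] for i, j in pairs) / len(pairs) if pairs else 0.0
    return k * mean_cf / sqrt(k + k * (k - 1) * mean_ff)


def best_subset(r_cf, r_ff):
    """Exhaustively maximize the merit over all non-empty subsets.

    Only feasible for a handful of features; this brute force is a
    stand-in for illustration, not the optimization method of the paper.
    """
    n = len(r_cf)
    candidates = (s for k in range(1, n + 1) for s in combinations(range(n), k))
    return max(candidates, key=lambda s: cfs_merit(s, r_cf, r_ff))


# Toy values (made up): features 0 and 1 are both relevant but redundant,
# feature 2 is slightly less relevant but nearly independent of the others.
r_cf = [0.8, 0.7, 0.6]
r_ff = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

selected = best_subset(r_cf, r_ff)
print(selected, round(cfs_merit(selected, r_cf, r_ff), 4))
```

With these toy values the search prefers the pair of features 0 and 2 over the redundant pair 0 and 1, which is the relevance-versus-redundancy trade-off that a filter measure of this kind encodes. Because the number of subsets grows exponentially with the number of features, an exact enumeration like this does not scale, which is the motivation for a mathematical programming formulation.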