Survey and Taxonomy of Feature Selection Algorithms in Intrusion Detection System

Intrusion detection systems deal with huge amounts of data containing irrelevant and redundant features, which cause slow training and testing, higher resource consumption, and poor detection rates. Feature selection is therefore an important issue in intrusion detection. In this paper we introduce the concepts and algorithms of feature selection, survey existing feature selection algorithms in intrusion detection systems, and group and compare the algorithms in three broad categories: filter, wrapper, and hybrid. We conclude the survey by identifying trends and challenges of feature selection research and development in intrusion detection systems.
