Intrusion detection systems (IDS) are designed to detect malicious activity in large-scale infrastructure, and many classification methods have been proposed to improve their accuracy. In this paper, we apply the greedy randomized adaptive search procedure with annealed randomness-forest (GAR-forest), a novel tree ensemble technique, together with feature selection to improve the classification accuracy of IDS. GAR-forest uses the GRASP metaheuristic with annealed randomness to increase the diversity of the ensemble. We used the NSL-KDD data sets to study the classification accuracy of GAR-forest on both binary and multi-class classification problems. On test data, GAR-forest outperforms random forest, C4.5, naive Bayes and multilayer perceptron, achieving 82.3989 % accuracy on the binary problem and 77.2622 % on the multi-class problem. We also applied three feature selection procedures (information gain, symmetrical uncertainty and correlation-based feature subset selection) to choose relevant features and further improve the accuracy of GAR-forest. GAR-forest with symmetrical uncertainty yields 85.0559 % accuracy using 32 features on the binary problem, and GAR-forest with information gain yields 78.9035 % accuracy using 10 features on the multi-class problem. GAR-forest is considerably faster than multilayer perceptron, although it is slower than naive Bayes, random forest and C4.5. The GRASP metaheuristic enables GAR-forest to reach the global optimal solution, which greedy deterministic approaches fail to reach.
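To make the two ranking criteria named above concrete, the following is a minimal sketch of information gain and symmetrical uncertainty for discrete features, using their standard definitions: IG(Y; X) = H(Y) - H(Y | X) and SU(X, Y) = 2 IG(Y; X) / (H(X) + H(Y)). It is an illustration only and not the implementation used in the paper. The function names and the toy records are hypothetical, and continuous attributes are assumed to have been discretized beforehand.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature column."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # Conditional entropy: label entropy within each feature value,
    # weighted by how often that value occurs.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def symmetrical_uncertainty(feature, labels):
    """SU(X, Y) = 2 * IG(Y; X) / (H(X) + H(Y)), a value in [0, 1]."""
    denom = entropy(feature) + entropy(labels)
    if denom == 0:
        return 0.0  # both columns are constant; nothing to measure
    return 2.0 * information_gain(feature, labels) / denom


def rank_features(columns, labels, score=information_gain):
    """Return feature names sorted from most to least relevant."""
    return sorted(columns, key=lambda name: score(columns[name], labels),
                  reverse=True)


if __name__ == "__main__":
    # Hypothetical toy records, not taken from NSL-KDD.
    labels = ["normal", "attack", "normal", "attack"]
    columns = {
        "flag": ["SF", "SF", "SF", "SF"],
        "protocol": ["tcp", "udp", "tcp", "udp"],
    }
    print(rank_features(columns, labels, score=symmetrical_uncertainty))
```

A filter of this kind keeps the top-k ranked features (for example, the 32 or 10 features reported above) before the classifier is trained, so the ranking is independent of the learning algorithm.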