Improving the Accuracy of Intrusion Detection Using GAR-Forest with Feature Selection

Intrusion detection systems (IDS) are designed to detect malicious activity in large-scale infrastructure, and many classification methods have been proposed to improve their accuracy. In this paper, we apply the greedy randomized adaptive search procedure with annealed randomness forest (GAR-forest), a novel tree ensemble technique, together with feature selection to improve the classification accuracy of IDS. GAR-forest uses the greedy randomized adaptive search procedure (GRASP) metaheuristic with annealed randomness to increase the diversity of the ensemble. We used the NSL-KDD dataset to study the classification accuracy of GAR-forest on both binary and multi-class classification problems. On the test data, GAR-forest outperforms random forest, C4.5, naive Bayes and multilayer perceptron, achieving 82.3989 % accuracy on the binary problem and 77.2622 % on the multi-class problem. We also applied three feature selection procedures (information gain, symmetrical uncertainty and correlation-based feature subset selection) to select relevant features and further improve the accuracy of GAR-forest. For the binary classification problem, GAR-forest with symmetrical uncertainty yields 85.0559 % accuracy using 32 features; for the multi-class classification problem, GAR-forest with information gain yields 78.9035 % accuracy using 10 features. GAR-forest is considerably faster than multilayer perceptron, although it is slower than naive Bayes, random forest and C4.5. The GRASP metaheuristic enables GAR-forest to reach globally optimal solutions that greedy deterministic approaches fail to reach.
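To make the two ranking criteria named above concrete, the following is a minimal sketch of information gain and symmetrical uncertainty for discrete features, using their standard definitions: IG(Y; X) = H(Y) - H(Y | X) and SU(X, Y) = 2 IG(Y; X) / (H(X) + H(Y)). It is not the implementation used in this work. The function names, the rank_features helper and the toy records are hypothetical and are not drawn from NSL-KDD. Continuous attributes would need to be discretized first, which the sketch does not do.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for a discrete feature X and labels Y."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # H(Y | X): entropy of the labels within each feature value, weighted by frequency.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def symmetrical_uncertainty(feature, labels):
    """SU(X, Y) = 2 * IG(Y; X) / (H(X) + H(Y)); defined as 0 when both entropies are 0."""
    denom = entropy(feature) + entropy(labels)
    return 0.0 if denom == 0 else 2.0 * information_gain(feature, labels) / denom


def rank_features(columns, labels, score=symmetrical_uncertainty):
    """Return feature names sorted from most to least relevant (hypothetical helper)."""
    return sorted(columns, key=lambda name: score(columns[name], labels), reverse=True)


# Toy, made-up records for illustration only.
labels = ["normal", "attack", "attack", "normal"]
columns = {
    "flag": ["SF", "SF", "S0", "SF"],
    "protocol": ["tcp", "udp", "udp", "tcp"],
}
ranking = rank_features(columns, labels)
```

In this toy data the protocol column separates the two classes perfectly, so it is ranked ahead of flag under either criterion. Keeping only the top-ranked k features and retraining the classifier on them is the usual way such a ranking is turned into a feature subset.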