Network Intrusion Detection with StackNet: A Phi Coefficient-Based Weak Learner Selection Approach

Network intrusion detection is a subject of growing concern as technology advances. Ensemble models that combine many base learners have been widely used to improve intrusion detection. Nevertheless, selecting the base learners remains challenging when they are assembled at random. The Matthews correlation coefficient (MCC), also known as the phi coefficient, is an effective measure for detecting associations between variables in many fields; however, to the best of the authors’ knowledge, very few studies in network intrusion detection and ensemble learning have applied MCC to the selection of base learners. In this paper, we propose a correlation-based classifier selection technique using MCC to improve the classification performance of an ensemble model under a StackNet strategy (named MCC-StackNet) for network intrusion detection. Specifically, the MCC-StackNet model aims to improve the association between the prediction accuracy and the diversity of the base classifiers. We compare the proposed MCC-StackNet with five other ensemble models and two stand-alone state-of-the-art classifiers commonly used in network intrusion detection, using accuracy, AUC, recall, precision, F1-score and Kappa as evaluation metrics. The experimental results on open-source data from Kaggle show that the MCC-StackNet model has a higher probability of correctly identifying unauthorised network traffic, at 99.73% accuracy, than XGBoost (97.61%), CatBoost (97.49%), LightGBM (%), GBC (97.63%), RF (97.97%), ET (95.82%), DT (96.95%) and KNN (95.56%), making MCC-StackNet an efficient and better-performing intrusion detection model.
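To make the idea of MCC-based base learner selection concrete, the following is a minimal sketch and not the procedure used in this paper, which the abstract does not specify. It assumes binary labels (0 = normal, 1 = intrusion), ranks candidate classifiers by their MCC against the ground truth, and greedily keeps a classifier only if its MCC with every already selected classifier stays below a threshold. The function names, the greedy rule, the `max_pairwise` threshold and the toy predictions are all illustrative assumptions.

```python
from math import sqrt


def mcc(a, b):
    """Matthews correlation coefficient (phi coefficient) of two binary 0/1 sequences."""
    tp = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    tn = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    fp = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    fn = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when a margin of the confusion matrix is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def select_base_learners(y_true, predictions, max_pairwise=0.9):
    """Hypothetical greedy selection: accurate first, then drop near-duplicates.

    predictions maps a classifier name to its 0/1 predictions on a validation set.
    max_pairwise is an assumed cut-off on the MCC between two classifiers.
    """
    # Rank candidates by agreement with the ground truth (higher MCC first).
    ranked = sorted(predictions, key=lambda n: mcc(y_true, predictions[n]), reverse=True)
    selected = []
    for name in ranked:
        # Keep a candidate only if it is not too correlated with any kept learner.
        if all(mcc(predictions[name], predictions[s]) < max_pairwise for s in selected):
            selected.append(name)
    return selected


# Toy validation data, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
preds = {
    "rf":  [1, 1, 1, 1, 0, 0, 0, 1],
    "et":  [1, 1, 1, 1, 0, 0, 0, 1],  # identical to "rf", so it adds no diversity
    "knn": [1, 1, 1, 0, 1, 0, 0, 0],
}
selected = select_base_learners(y_true, preds)
print(selected)  # ['rf', 'knn']
```

In this toy case "et" is dropped because its predictions are perfectly correlated with those of "rf", while "knn" is retained despite lower accuracy because it makes different errors. The selected classifiers would then serve as the base layer of a stacked ensemble.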