CyberLearning: Effectiveness Analysis of Machine Learning Security Modeling to Detect Cyber-Anomalies and Multi-Attacks

Abstract Detecting cyber-anomalies and attacks is a growing concern in cybersecurity. Artificial intelligence, and machine learning techniques in particular, can be used to tackle these issues. However, the effectiveness of a learning-based security model may vary depending on the security features and the data characteristics. In this paper, we present “CyberLearning”, a machine learning-based cybersecurity modeling approach with correlated-feature selection, together with a comprehensive empirical analysis of the effectiveness of various machine learning-based security models. In our CyberLearning modeling, we consider a binary classification model for detecting anomalies and a multi-class classification model for various types of cyber-attacks. To build the security model, we first employ ten popular machine learning classification techniques: naive Bayes, logistic regression, stochastic gradient descent, K-nearest neighbors, support vector machine, decision tree, random forest, adaptive boosting, eXtreme gradient boosting, and linear discriminant analysis. We then present an artificial neural network-based security model with multiple hidden layers. The effectiveness of these learning-based security models is examined through a range of experiments on two of the most popular security datasets, UNSW-NB15 and NSL-KDD. Overall, this paper aims to serve as a reference point for data-driven security modeling through our experimental analysis and findings in the context of cybersecurity.
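As a rough illustration of the correlated-feature selection idea mentioned in the abstract, the following minimal sketch keeps only the features whose Pearson correlation with the class label exceeds a cutoff. The abstract does not state the exact selection criterion, so the use of Pearson correlation against the label, the 0.3 threshold, the function names `pearson` and `select_correlated_features`, and the toy data are all assumptions made for illustration, not the paper's implementation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_correlated_features(rows, labels, threshold=0.3):
    """Return indices of columns whose |correlation| with the label >= threshold.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    selected = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        if abs(pearson(column, labels)) >= threshold:
            selected.append(j)
    return selected


# Toy binary-labelled data (0 = normal, 1 = anomaly), built deterministically.
labels = [0, 1, 0, 1] * 25
pattern = [0, 0, 1, 1] * 25          # uncorrelated with the label by construction
rows = [
    [2 * y + 0.1 * p,                # feature 0: strongly tied to the label
     float(p),                       # feature 1: unrelated to the label
     5.0]                            # feature 2: constant
    for y, p in zip(labels, pattern)
]

selected = select_correlated_features(rows, labels)
print(selected)  # prints [0]
```

In a fuller pipeline, the selected columns would then be passed to each classifier for both the binary (anomaly vs. normal) and the multi-class (attack type) settings.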
