Machine Learning and Cyber Security

The application of machine learning (ML) techniques in cyber security is increasing more than ever before. From IP traffic classification to filtering malicious traffic for intrusion detection, ML is one of the promising approaches that can be effective against zero-day threats. New research is being done using statistical traffic characteristics and ML techniques. This paper is a focused literature survey of machine learning and its application to cyber analytics for intrusion detection, traffic classification, and applications such as email filtering. Methods were identified and summarized based on their relevance and number of citations. Because datasets are an important part of ML approaches, some well-known datasets are also mentioned. Recommendations are also provided on when to use a given algorithm. Four ML algorithms were evaluated on MODBUS data collected from a gas pipeline: various attacks were classified using these algorithms, and the performance of each algorithm was assessed.
